Introduction to Engineering Experimentation
Experimental systems may have a ground point that is not connected to earth ground. Ground is simply a common connection point that keeps grounded components at the same potential. The power-plug ground may connect to the actual earth ground a great distance from the experimental setup. In addition, power grounds usually run in conduits parallel to power conductors, which induce noise voltages. Keep the two signal wires very close to each other.
It is preferable that the two signal wires actually be twisted around each other (twisted-pair wires). Avoid getting power and signal wires too close to each other. If possible, use a single ground for the entire experimental setup. Each shield should be connected to ground with a single connection.
Use high-quality power supplies. In some cases it may be better to use batteries as power supplies for instruments instead of ac line-powered power supplies to minimize power-line interference noise. One way to do this is to observe the results on the indicating or recording device with a static measurand applied to the transducer.
Any variation in output with time represents noise. With improperly wired millivolt systems, it is not unusual to find noise with a larger amplitude than the desired signal.
These transducers are normally more expensive than low-level-output transducers, but the signals are much less susceptible to interference. The reason they are more expensive is that an instrumentation amplifier is normally included directly in the transducer.
These high-level signals can normally be transmitted for a distance of 30 to ft without major problems. Over the range of the transducer, this current will vary between 4 and 20 mA. A typical current-loop system is shown in Figure 3.
This current is then converted to a voltage at the receiving end. The minimum current of 4 mA means that open circuits in the wiring are easy to detect: a reading of 0 mA is possible only under open-circuit conditions. The current signals are also much less susceptible than analog voltage signals to environmental noise since the power associated with current signals is very large compared to most analog voltage systems.
Long cable lengths will, however, constrain the upper-frequency limit of the signals. In digital signal transmission, the information in the transducer signal is converted to a series of voltage pulses, called bits, which transmit the information in digital code. If the voltage of the pulse exceeds a certain level, the pulse is 'on'; if the voltage is below another level, the pulse is 'off.'
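The 4 to 20 mA scaling and the open-circuit check described above can be sketched in a few lines of Python. This is only an illustration: the 250-ohm receiving resistor, the 3.5 mA fault threshold, and the function names are assumptions, not values from the text.

```python
def receiver_voltage(current_ma, resistor_ohm=250.0):
    """Voltage across the receiving resistor (250 ohm is an assumed value)."""
    return current_ma / 1000.0 * resistor_ohm


def loop_current_to_measurand(current_ma, low, high, fault_threshold_ma=3.5):
    """Map a 4-20 mA loop current linearly onto the measurand range [low, high].

    A current well below 4 mA cannot occur in normal operation, so it is
    treated as a wiring fault (the 3.5 mA threshold is an assumption).
    """
    if current_ma < fault_threshold_ma:
        raise ValueError("loop current below 4 mA: probable open circuit")
    return low + (current_ma - 4.0) / 16.0 * (high - low)


# Mid-scale current of 12 mA on a hypothetical 0-100 unit transducer.
mid_scale = loop_current_to_measurand(12.0, 0.0, 100.0)
full_scale_voltage = receiver_voltage(20.0)
```

Raising an error on a near-zero current is one possible design choice; a logging or flagging scheme would serve equally well.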
When such systems are designed properly, the signals are almost immune to problems from environmental noise. Signals from satellites visiting planets millions of miles away are transmitted successfully in digitally coded radio waves. In these cases the background noise is very strong relative to the radio signal, yet it is still possible to extract high-quality data.
Instrumentation for Engineering Measurements, 2d ed. Grounding and Shielding Techniques in Instrumentation, 3d ed. Guide to Electronic Measurements and Laboratory Practice, 2d ed.
An amplifier produces an output of when the input is 5 What are the gain G and the decibel gain GdB? If the input voltage is 3 mV, what is the output voltage? What is the gain in dB for each of these selections? The output for the same force is 4. What is the output impedance of the force transducer?
It is to have a gain of Specify values for the two resistors. What will be the gain and phase angle at 10 kHz? What will be the cutoff frequency and the values of R1 and R2? What will be the roll-off in dB per decade? Specify values of the resistors in an attenuation network such that the loading error of the voltage at the output terminals of the generator is 0.
The output impedance of the generator is 10 and the filter has an input impedance of kΩ. The maximum input voltage to the data acquisition system is 8 V and its input impedance is 1 MΩ. The output impedance of the power line circuit is 0. If the output amplitude of a 3-kHz sine wave is 0. V, A recording device has a frequency response which shows that the output is down 2 dB at Hz.
If the actual input is 5. Superimposed on this signal is Hz noise with an amplitude of 0. Select a filter order to perform this task if the corner frequency is 10 Hz.
Superimposed on this signal is an additional signal with a frequency of Hz and an amplitude of 0. Using a corner frequency of Hz, select a filter order to perform this task.
What is the input voltage? What is the range of the oscilloscope on that scale? The vertical scale is set to 5 mV per division. What would you consider to be a reasonable estimate of resolution of this device in mV?
In this chapter we describe the basic components and operation of computerized or a similar device is often used to point to regions on the screen to communicate with the computer. Some computers have what are known as touch screens, in which the user can touch regions of the screen to communicate.
Usually, computers have some kind of printer in order to produce printed output (hard copy). The method of printing depends on the quantity and quality of printing required. Optical devices such as laser printers are common for all types of computers.
Moderately priced ink jet printers usually have the capability to print in color, which is often desirable in experimental work. Computers also have features that enable the user, optionally, to connect to other devices. Inside the computer, it is possible to connect to the bus, a series of conductors connecting the internal components of the computer.
Outside the computer, there are generally plug connections known as ports, which are connected to the bus internally. The components that convert a computer to a computerized and is assigned a numerical value of 1, and the other state is defined as 'off' and is assigned a numerical value of 0. A series of flip-flops is required to represent a number. For example, the binary number 1001, which corresponds to the decimal number 9, can be represented in a computer using four flip-flops.
Each of these flip-flops represents a 'bit' of the number. The leftmost '1' in the binary number is the most significant bit (MSB). The rightmost '1' is the least significant bit (LSB). There is a one-to-one correspondence between binary numbers and decimal numbers. Example 4. The maximum allowable positive output is $2^N/2 - 1$, and the maximum negative output is $-2^N/2$. If the computed output is outside these limits, the converter is saturated and the actual output will have the value of the nearest limit.
B of Figure 4. Solution: Since this is a simple binary device, Eq. Find the output codes when the input is -11, -5, 0, 6. Solution: Substituting into Eq. A in Figure 4. The value 12 is above the input range, so the output is the largest output state, which is for a bit converter. The experimenter should be suspicious of any result at the limits of the range. For example, an analog input of -13 V would produce the same output as the -11-V input in the table.
S LSB. The user must select a converter with suitable resolution to obtain acceptable accuracy. Solution: Using Eq. A simple, but not very efficient, approach would be to have separate sensors and displays located at each station with the values manually checked periodically to ensure that things are operating smoothly.
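The saturation behavior described above can be illustrated with a small Python sketch of an idealized bipolar converter. The conversion formula, the 8-bit resolution, and the -10 V to +10 V input range are assumptions made for illustration; the converter in the original example may differ.

```python
import math


def adc_output(v_in, v_low, v_high, n_bits):
    """Output code of an idealized bipolar (two's-complement) A/D converter.

    Assumed model: the input range [v_low, v_high] is divided into 2**n_bits
    equal steps, and codes run from -2**n_bits/2 to 2**n_bits/2 - 1.
    """
    steps = 2 ** n_bits
    code = math.floor((v_in - v_low) / (v_high - v_low) * steps) - steps // 2
    # Saturate: out-of-range inputs take the value of the nearest limit.
    return max(-(steps // 2), min(steps // 2 - 1, code))


# Hypothetical 8-bit converter with a -10 V to +10 V input range.
codes = {v: adc_output(v, -10.0, 10.0, 8) for v in (-13, -11, -5, 0, 6, 12)}
```

With these assumed settings, the -13 V and -11 V inputs both return the most negative code and the 12 V input returns the most positive code, which is why a reading at either limit should be treated with suspicion.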
A more elegant and useful approach, however, would be to combine all of the signals together and display them on a single screen. With Ethernet connections, these components can even be spread across several states or countries. In the above scenario, the custom software may consist of an on-screen schematic of the piping system with the current pressure and temperature displayed at each measurement station. The interface may also allow for various settings to be changed, such as the closing or opening of a valve, by simply clicking on a button located near the valve in the schematic.
Virtual instruments are recognized in contrast to measurement hardware that has a predefined function. For example, a digital multimeter uses an A/D converter and a digital display to read and display a voltage level.
Other functions, such as measuring AC voltage amplitude, are hard-wired into the device and new functions cannot be added as needed. A computer with an A/D converter and appropriate software can also perform the same functions, but provides additional flexibility and customizability by exploiting the capabilities of the computer to which it is attached. It would also be possible to display multiple signals on a single screen or perform mathematical operations on the signals.
Figure 4. The theorem also specifies methods that can be used to reconstruct the original signal. The amplitude in Figure 5. The sampling-rate theorem has a well-established theoretical basis. There is some evidence that the concept dates back to the nineteenth-century mathematician Augustin Cauchy Marks, The theorem is often known by the names of the latter two scientists. A comprehensive but advanced discussion of the subject is given by Marks This process will be discussed in some detail later in the chapter.
Even if the signal is correctly sampled i. For example, Figure 5. The sampled data are shown as the small squares. However, these data are not only consistent with a Hz sine wave but in this case, the data are also consistent with Actually, there are an infinite number of higher frequencies that are consistent with the data.
The higher frequencies can be eliminated from consideration since it is known that they don't exist. In some cases, the requirements of the sampling-rate theorem may not have been met, and it is desired to estimate the lowest alias frequency.
The lowest is usually the most obvious in the sampled data. A simple method to estimate alias frequencies involves the folding diagram as shown in Figure 5. The use of this diagram is demonstrated in Example 5. Example 5. The lowest alias frequency is the difference between frequency. In part (b), the sampling frequency is less than the signal frequency. The folding diagram is the simplest method to determine the lowest alias frequency.
In part (c), the requirement of the sampling-rate theorem has been met, and the alias frequency is in fact the signal frequency. To know that the frequency is correct, we must ensure that the sampling rate is at least twice the actual frequency, usually by using a filter to remove any frequency higher than half the sampling rate. The process of determining these component frequencies is called spectral analysis.
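The lowest alias frequency discussed above can also be estimated numerically instead of with the folding diagram. The following Python sketch folds the signal frequency back into the range from zero to half the sampling rate; the function name is hypothetical, and the two test cases reuse sampling situations that appear in the chapter problems.

```python
def lowest_alias(f_signal, f_sample):
    """Lowest apparent frequency when f_signal is sampled at f_sample."""
    f_nyquist = f_sample / 2.0
    # Remove whole multiples of the sampling rate.
    f_folded = f_signal % f_sample
    # Fold the remainder back about half the sampling rate.
    if f_folded > f_nyquist:
        f_folded = f_sample - f_folded
    return f_folded


alias_a = lowest_alias(5000, 3000)   # 5-kHz signal sampled at 3 kHz
alias_b = lowest_alias(3000, 4000)   # 3-kHz signal sampled at 4 kHz
no_alias = lowest_alias(1000, 4000)  # sampling-rate theorem satisfied
```

When the sampling rate is more than twice the signal frequency, the function simply returns the signal frequency, matching the statement that the alias frequency is then the signal frequency itself.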
There are two times in an experimental program when it may be necessary to perform spectral analysis on a waveform. The first time is in the planning stage and the second is in the final analysis of the measured data.
In planning experiments in which the data vary with time, it is necessary to know, at least approximately, the frequency characteristics of the measurand in order to specify the required frequency response of the transducers and other instruments and to determine the sampling rate required. In many time-varying experiments, the frequency spectrum of a signal is one of the primary results.
To examine the methods of spectral analysis, we first look at a relatively simple waveform, a simple Hz sawtooth wave as shown in Figure 5.
At first, one might think that this wave contains only a single frequency, Hz. However, it is much more complicated, containing all frequencies that are an odd-integer multiple of , such as , , and Hz. The lowest frequency, f0, in the periodic wave shown in Figure 5. Of course, Eq. If f(t) is even, it can be represented entirely with a series of cosine terms, which is known as a Fourier cosine series. If f(t) is odd, it can be represented entirely with a series of sine terms, which is known as a Fourier sine series.
Many functions are neither even nor odd and require both sine and cosine terms. If Eqs. These have frequencies of , , , and Hz, respectively. Figure 5. As can be seen, the sum of the first and third harmonics does a fairly good job of representing the sawtooth wave. The main problem is apparent as a rounding near the peak, a problem that would be reduced if the higher harmonics e. If, for example, the experimenter considers the first-plus-third harmonics to be a satisfactory approximation to the sawtooth wave, the sensing instrument need only have an upper frequency limit of Hz.
Solution: The fundamental frequency for this wave is 10 Hz and the angular frequency, ω is Also, by examination, we can conclude that it is an odd function and that the cosine terms will be zero and only the sine terms will be required. Using Eq. One problem associated with Fourier-series analysis is that it appears to only be useful for periodic signals.
In fact, this is not the case and there is no requirement that f(t) be periodic to determine the Fourier coefficients for data sampled over a finite time. We could force a general function of time to be periodic simply by duplicating the function in time as shown in Figure 5.
If we directly apply Eqs. However, if the resulting Fourier series were used to compute values of f(t) outside the time interval 0 to T, it would result in values that would not necessarily (and probably would not) resemble the original signal.
The analyst must be careful to select a large enough value of T so that all wanted effects can be represented by the resulting Fourier series.
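As a rough illustration of evaluating Fourier coefficients over a finite interval T, the Python sketch below integrates numerically with the midpoint rule. It assumes the common convention f(t) = a0 + sum of a_n cos(n ω0 t) + b_n sin(n ω0 t), with a0 equal to the mean value over T; the 10-Hz square wave used as input is a hypothetical test signal, not a waveform from the text.

```python
import math


def fourier_coefficients(f, period, n_max, samples=2000):
    """Approximate a0, a_n, and b_n (n = 1..n_max) by midpoint integration."""
    dt = period / samples
    w0 = 2.0 * math.pi / period  # fundamental angular frequency
    times = [(k + 0.5) * dt for k in range(samples)]
    a0 = sum(f(t) for t in times) * dt / period
    a, b = [], []
    for n in range(1, n_max + 1):
        a.append(2.0 / period * sum(f(t) * math.cos(n * w0 * t) for t in times) * dt)
        b.append(2.0 / period * sum(f(t) * math.sin(n * w0 * t) for t in times) * dt)
    return a0, a, b


def square_wave(t, period=0.1):
    """Hypothetical odd test signal: +1 for the first half period, then -1."""
    return 1.0 if (t % period) < period / 2.0 else -1.0


a0, a, b = fourier_coefficients(square_wave, 0.1, 3)
```

Because the test signal is odd, the cosine coefficients come out close to zero and only the sine terms are significant, in line with the discussion of Fourier sine series above.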
An alternative method of finding the spectral content of signals is that of the Fourier transform, discussed next. The Fourier transform is a generalization of Fourier series. The Fourier transform can be applied to any practical function, does not require that the function be periodic, and for discrete data can be evaluated quickly using a modern computer technique called the Fast Fourier Transform.
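The idea of extracting spectral content from discrete data can be sketched with a direct discrete Fourier transform in Python. Note that this is the slow, straightforward summation rather than the Fast Fourier Transform itself (an FFT produces the same coefficients far more efficiently), and the 10-Hz sine wave sampled at 100 Hz is a hypothetical input.

```python
import cmath
import math


def dft(samples):
    """Direct discrete Fourier transform (O(N^2)); coefficients are complex."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


sample_rate = 100.0  # samples per second (assumed)
n_points = 100       # one second of data, so the bins are 1 Hz apart
signal = [math.sin(2.0 * math.pi * 10.0 * i / sample_rate) for i in range(n_points)]

spectrum = dft(signal)
# Only the first half of the coefficients corresponds to frequencies
# below half the sampling rate.
magnitudes = [abs(c) for c in spectrum[: n_points // 2]]
peak_bin = max(range(len(magnitudes)), key=lambda k: magnitudes[k])
peak_frequency = peak_bin * sample_rate / n_points
```

Plotting `magnitudes` against `k * sample_rate / n_points` would give a bar chart with a single peak at the signal frequency.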
In presenting the Fourier transform, it is common to start with Fourier series, but in a different form than Eq. This form is called the complex exponential form. These relationships can be used to transform Eq. $f(t) = \sum_{n=-\infty}^{\infty} c_n e^{jn\omega_0 t}$ 5. In Section 5. If a longer value of T is selected, the lowest frequency will be reduced. This concept can be extended to make T approach infinity and the lowest frequency approach zero. In this case, frequency becomes a continuous function. It is this approach that leads to the concept of the Fourier transform.
The Fourier transform of a function. Use of Filtering to Limit Sampling Rate Many measurement systems effectively eliminate unwanted high frequencies at the sensing stage. Most other sensing devices will show an attenuated response if the frequency is high enough.
In many of these systems, the sampling rate can be made sufficiently high to avoid aliasing effects. As an example, consider a measuring device being used to measure a quasi-steady temperature.
Equation 5. The sum of these 20 Fourier components is presented on Figure E5. The agreement between the data and the Fourier series is considered adequate. As the A/D converter is 8 bit and bipolar, it has a dB dynamic range. Since we require 42 dB, we can use Eq. The minimum sampling rate can then be determined from Eq.
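The dynamic-range and filter-attenuation reasoning in this example can be sketched in Python. The formulas used here are the commonly quoted ones and are assumptions relative to the equations referenced in the text: dynamic range of 20 log10(2^N) for a unipolar converter and 20 log10(2^(N-1)) for a bipolar one, and the Butterworth magnitude response 1/sqrt(1 + (f/fc)^(2n)). The 10-Hz corner and 100-Hz stop frequency are hypothetical.

```python
import math


def dynamic_range_db(n_bits, bipolar=True):
    """Converter dynamic range in dB (assumed formula, see lead-in)."""
    levels = 2 ** (n_bits - 1) if bipolar else 2 ** n_bits
    return 20.0 * math.log10(levels)


def butterworth_attenuation_db(f, f_corner, order):
    """Attenuation of an ideal Butterworth low-pass filter at frequency f."""
    return 10.0 * math.log10(1.0 + (f / f_corner) ** (2 * order))


def minimum_order(f_stop, f_corner, required_db, max_order=20):
    """Smallest filter order giving at least required_db at f_stop."""
    for order in range(1, max_order + 1):
        if butterworth_attenuation_db(f_stop, f_corner, order) >= required_db:
            return order
    raise ValueError("required attenuation not reached within max_order")


range_8bit = dynamic_range_db(8, bipolar=True)
order_needed = minimum_order(f_stop=100.0, f_corner=10.0, required_db=42.0)
```

Under these assumptions an 8-bit bipolar converter gives roughly 42 dB, consistent with the requirement quoted in the example.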
Comments: It might be desirable to increase the sampling rate and use a lower-order filter. Filters have significant phase distortion within the filter bandpass, and this distortion would be less if the filter order were lower. This selected filtering is quite conservative. Their amplitude can be reduced to less than the A/D converter threshold with lesser attenuation.
It would thus be possible to use a lesser order of filter or a lower sampling rate using a more detailed analysis. Actually, the sampling constraints discussed above may be too stringent for some applications. This approach, which allows some distortion, should be undertaken only if the experimenter has a good understanding of the analog signal and the experimental requirements. The Fourier transform, Scientific American, June. Republished by Dover, New York, Understanding Digital Signal Processing.
Discrete-Time Signal Processing 2nd ed. Computer Based Data Acquisition Systems, 2d ed. Using direct integration, evaluate the Fourier coefficients a0, a1. Could you have deduced the values of a0, a1, and a2 without performing the integrations?
Evaluate the Fourier constants a0, a1, a2, b1, and b2 by direct integration. For the ramp function described in Problem 5. Use equally spaced time intervals. Specify a periodic form of this function so that only sine terms will be required for a Fourier-series approximation. Specify a periodic form of this function so that it can be represented entirely with a Fourier cosine series. Plot this sum and compare it to the original function.
Perform an FFT on the results and find the magnitude of the coefficients as produced by the FFT; they will be complex. Plot the results versus frequency in a bar chart such as Figure 5. Interpret the peaks observed.
Perform an FFT on the results and find the magnitude of the coefficients as produced by the FFT; they will be complex. Perform an FFT on the results, evaluate the magnitude of the coefficients, and present the result in a bar graph such as Figure 5. Plot the original function and the windowed function on the same graph. Perform an FFT on the original results and take the steps to create a bar graph like Figure 5. Perform an FFT on the windowed data and create another bar graph.
What false alias frequency(ies) would you expect in the discrete data? The function in Problem 5. What would be the minimum sampling rate to avoid false aliasing for the function in Problem 5. What false alias frequencies would you expect in the output? What would be the minimum sampling rate to avoid false frequencies in the sampled data for the function in Problem 5.
Using a spreadsheet program, plot the function of Problem 5. Connect the points with straight lines. Explain the shape of the resulting plot. What would be the lowest expected alias frequency? A 5-kHz sine wave signal is sampled at 3 kHz. A 1-kHz sine wave is sampled at 1. What is the lowest expected alias frequency? A 3-kHz sine wave is sampled at 4 kHz. What is the dynamic range of a bit bipolar A/D converter? What is the dynamic range of a bit unipolar A/D converter?
For an experiment, fc is 10 kHz and the A/D converter is bipolar with 12 bits. In a vibration experiment, an acceleration-measuring device can measure frequencies up to 3 kHz, but the maximum frequency of interest is Hz. A data acquisition system is available with a bit bipolar A/D converter and a maximum sampling rate of 10, samples per second. Specify the corner frequency and order of a Butterworth antialiasing filter for this application (attenuating the signal to the converter threshold) and the actual minimum sampling rate.
In a time-varying pressure measurement, the maximum frequency of interest is Hz, but there are frequencies in the signal with significant energy up to Hz.
A first-order Butterworth filter is to be constructed to filter out the higher frequencies. These characteristics of the data in Table 6. Figure 6. Some general guidelines apply to the construction of histograms. See Rees, It is customary to have from 5 to 15 bins. It is simplest if each of the bins has the same width (difference between the smallest and largest values in the bin). There are special rules if bins of unequal width are used. The bins should cover the entire range of the data, with no gaps, but the bins should not overlap.
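The histogram guidelines above (equal-width bins that cover the whole range with no gaps and no overlap) can be sketched in Python as follows. The readings are hypothetical values chosen for illustration, not the data of the table in the text, and the function assumes the data are not all identical.

```python
def histogram(data, n_bins):
    """Count samples in n_bins equal-width, non-overlapping bins.

    Returns the bin edges, the counts, and the relative frequencies
    (count divided by the total number of samples).
    """
    lowest, highest = min(data), max(data)
    width = (highest - lowest) / n_bins
    edges = [lowest + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for x in data:
        # The largest value is placed in the last bin rather than a new one.
        index = min(int((x - lowest) / width), n_bins - 1)
        counts[index] += 1
    relative = [c / len(data) for c in counts]
    return edges, counts, relative


# Hypothetical temperature readings.
readings = [1080, 1092, 1095, 1101, 1103, 1104, 1108, 1112, 1115, 1130]
edges, counts, relative = histogram(readings, 5)
```

Each bin here is closed on the left and open on the right, except the last, which also includes the maximum; this is one simple way to satisfy the no-gap, no-overlap rule.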
In Figure 6. This example illustrates a random variable that can vary continuously and can take any real value in a certain domain. Such a variable is called a continuous random variable.
The theoretical function can then be used to make predictions about various properties of the data. The population comprises the entire collection of objects, measurements, observations, and so on whose properties are under consideration and about which some generalizations are to be made.
Examples of population are the entire set o. After Johnson, A sample is a representative subset of a population on which an experiment is performed and numerical data are obtained.
For example, 10 light bulbs can be selected from a production batch of 10,, or wind speed can be sampled once each hour over a h period. Different samples of the same population may be chosen for experimentation. Sample space. The set of all possible outcomes of an experiment is called the sample space.
For example, there are six possible outcomes in casting a fair die. If the sample space is made of discrete values (such as the outcomes of casting a die or a coin, acceptable and unacceptable products), it is a discrete sample space. A random variable of a discrete sample space is discrete.
If a sample space is a continuum, we have a continuous sample space and also continuous random variables.
The sample space for the temperature measurements of gases coming out of a furnace is continuous. Random variable. Engineering experiments and any associated measurement are influenced by many factors that cannot be totally controlled.
Two examples of such experiments are the measurement of temperature of a hot gas flowing through a duct and the life expectancy of light bulbs. The value of temperature is a function of many factors, including the operation of the heating source, duct insulation and environment, and, more important, the nature of the flow itself and the measurement device. In the case of the bulb, variation in material properties, manufacturing process, and measurement process can influence the measured life of the bulbs.
In each of the experiments mentioned, no matter how well we control the influencing parameters, some variation in the measured values remains. The variables being measured (temperature and lifetime) in these cases are considered random variables. Mathematically, a random variable is a numerically valued function defined for the population.
A random variable can be continuous or discrete. The variables in the light bulb and duct temperature examples are continuous random variables. In principle, they can assume any real value. Discrete random variables have a countable number of possible values.
Distribution function. A distribution function is a graphical or mathematical relationship that is used to represent the values of the random variable. A parameter is a numerical attribute of the entire population. As an :r for that Event. A statistic is a numerical attribute of the sample. For example, the average value of a sample property is a statistic of the sample. Probability is the chance of occurrence of an experiment. The probability is obtained by dividing the number of success! This is 6. For a population with a finite number of elements, N, with values x_i, the mean is denoted by the symbol μ.
Two other common parameters describing central tendency are the median and the mode. If the measurands are arranged in ascending or descending order, the median is the value at the center of the set. The mode is the value of the variable that corresponds to the peak value of the probability of occurrence of the event. For some distributions e. When a distribution has more than one mode, the frequencies of occurrence of each mode need not be the same.
While it is common for the mean, median, and mode to have close to the same value (although they will generally not have exactly the same value), in some data sets they may have significantly different values.
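A short Python sketch using the standard `statistics` module shows how the three measures of central tendency can differ. Both data sets are hypothetical; the first contains one unusually large value so that the mean is pulled away from the median and mode.

```python
import statistics

# Hypothetical sample with one large value.
data = [2, 3, 3, 4, 5, 5, 5, 6, 20]

mean_value = statistics.mean(data)      # arithmetic average
median_value = statistics.median(data)  # value at the center of the sorted set
modes = statistics.multimode(data)      # most frequent value(s)

# A distribution can have more than one mode.
bimodal_modes = statistics.multimode([1, 1, 2, 3, 3])
```

`statistics.multimode` returns a list, so it handles data with more than one mode without special treatment.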
For example, a set of measurements ranging from 90 to 110 has a greater dispersion than a set of measurements ranging from 95 to 1 If the desired event occurs, the outcome is considered successful.
The prediction of the probability of an event is one purpose of statistical analysis. Probability is always a non-negative number with a maximum of 1: 0 ≤ P(x or x_i) ≤ 1. If event Ā is the complement of event A (this means that if event A occurs, event Ā cannot occur), then 6. If the events A and B are mutually exclusive i. For example, after testing a sample from a batch of light bulbs for time to failure, we may want to know the probability that an additional bulb selected from the batch will have a time to failure of less than a certain value.
In our duct temperature data in Table 6. We can replot the data of Figure 6. Relative frequency is the number of samples in each bin, divided by the total number of samples. Since the relative frequency of the bin for to is 0. However, such an approach has decided limitations. This is particularly true for the bins with only one or two samples. Although, intuitively, we expect some finite, small probability for temperatures less than or greater than , the use of the sample data directly would not result in reasonable estimates of these probabilities.
For continuous random variables, the functions are called probability density functions. For most experimental situations, the appropriate distribution function can be determined from experience.
This technique is presented in many statistical texts, such as Harnett and Murphy It is used to determine the probability that a random variable has a value less than or equal to a specified value. Example 6. Solu tion: a Using Eq.
Either by substituting 15 into the equation or by reading from the graph, we find that the probability that the lifetime is less than 15 h is 0.
Binomial Distribution
The binomial distribution is a distribution which describes discrete random variables that can have only two possible outcomes: 'success' and 'failure.' The following conditions need to be satisfied for the binomial distribution to be applicable to a certain experiment:
1. Each trial in the experiment can have only the two possible outcomes of success or failure.
2. The probability of success remains constant throughout the experiment.
3. The experiment consists of n independent trials.
Determine the probability that in a batch of 20 computers, 5 will require repair during the warranty period. Success will be defined as not needing repair within the warranty period. Other assumptions underlying the application of this distribution are that all trials are independent and that the probabilities of success and failure are the same for all computers.
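The binomial calculation for the computer example can be sketched in Python. The per-computer repair probability used below (0.1) is a hypothetical value, since the figure from the original example is not available here; only the batch size of 20 and the count of 5 repairs come from the text.

```python
import math


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * p ** r * (1.0 - p) ** (n - r)


p_repair = 0.1  # assumed probability that one computer needs repair

# Counting repairs directly: 5 repairs among 20 computers.
p_five_repairs = binomial_pmf(5, 20, p_repair)

# Equivalent view used in the text: success means NOT needing repair,
# so 5 repairs correspond to 15 successes with probability 1 - p_repair.
p_fifteen_ok = binomial_pmf(15, 20, 1.0 - p_repair)

# The probabilities over all possible outcomes should sum to 1.
total = sum(binomial_pmf(r, 20, p_repair) for r in range(21))
```

Either definition of success gives the same answer, because the binomial coefficient is symmetric in the number of successes and failures.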
Using Eqs. If we buy four of these bulbs, what are the probabilities of finding that four, three, two, one, and none of the bulbs are defective? Again, we can use the binomial distribution. The probability of having four, three, two, one, and zero defective light bulbs can be calculated by using Eq. Solution: We use Eq. Poisson Distribution The Poisson distribution is used to estimate the number of random occurrences of an event in a specified interval of time or space if the average number of occurrences is already known.
For example, if it is known that, on average, 10 customers visit a bank per five-minute period during the lunch hour, the Poisson distribution can be used to predict the probability that 8 customers will visit during a particular five-minute period.
The Poisson distribution can also be used for spatial variations. For instance, if it is known that there are, on average, two defects per square meter of printed circuit boards, the Poisson distribution can be used to predict the probability that there will be four defects in a square meter of boards. The following two assumptions underlie the Poisson distribution:
1. The probability of occurrence of an event is the same for any two intervals of the same length.
2. The probability of occurrence of an event is independent of the occurrence of other events.
To compute the probability that the number of occurrences is less than or equal to k, the sum of the probability of k, k 1.
This is given by - 6. Find the probability that there are exactly zero errors during a one-minute period. Solution: For this problem, λ is 3 and x is 0.
Substituting into Eq. Solution: First we need to compute the probability of 0, 1, 2, and 3 errors using Eq. What is the probability that there will be (a) a single defect in a weld that is 0. λ, the average number of defects in 0. Normal (Gaussian) Distribution The normal or Gaussian distribution function is a simple distribution function that is useful for a large number of common problems involving continuous random variables.
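The Poisson calculation for λ = 3 and x = 0 can be checked with a short Python sketch. It uses the standard Poisson probability function P(x) = e^(-λ) λ^x / x!, which is assumed to match the equation referenced in the text; the function names are hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean number is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_cdf(k, lam):
    """Probability of k or fewer occurrences: sum of P(0) through P(k)."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))


p_zero_errors = poisson_pmf(0, 3.0)      # exactly zero errors, lambda = 3
p_three_or_fewer = poisson_cdf(3, 3.0)   # 0, 1, 2, or 3 errors
```

The cumulative function simply adds the individual probabilities, which is the summation described in the text for "less than or equal to k" occurrences.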
The normal distribution has been shown to describe the dispersion of the data for measurements in which the variation in the measured value is due totally to random factors and occurrences of both positive and negative deviations are equally probable. According to the definition of the probability density function [Eq. To simplify the numerical integration process, the integrand is usually modified with a change of variable so that the numerically evaluated integral is general and useful for all problems.
This normalized function is plotted in Figure 6. Taking the differential of Eq. Equation 6. On the other hand, the integral in Eq. If z1 in Eq.
The results of this integration are shown in Table 6. The table presents the value of the probability that the random variable has a value between 0 and z for z values shown in the left column and the top row. The top row serves as the second decimal point of the first column. The process of using Table 6. Find the probability that a single reading is (a) between 9 and From Table 6. The result we seek is the difference between these two areas: P(-2 ≤ z ≤ For 2σ, the probability is The concepts of confidence intervals and confidence level are discussed in greater detail later in the chapter.
The average diameter of the cylinders, μ, has been measured to be 4. What are the probabilities of the following cases? Lognormal Distribution In some experimental situations, the measured variable is restricted to positive values and the values of this variable are usually small but can occasionally be very large. When plotted as a histogram, these data give rise to skewed distributions with very long upper tails as depicted in Figure 6. Such variables can sometimes be efficiently described by a lognormal distribution.
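The table lookup described above can be reproduced numerically. The Python sketch below evaluates areas under the standard normal curve with the error function from the `math` module instead of a printed table; the function names are hypothetical, and the standardization z = (x - μ)/σ is the usual change of variable.

```python
import math


def standard_normal_cdf(z):
    """Probability that a standard normal variable is less than or equal to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_between(x1, x2, mean, std_dev):
    """Probability that a normal variable lies between x1 and x2."""
    z1 = (x1 - mean) / std_dev
    z2 = (x2 - mean) / std_dev
    return standard_normal_cdf(z2) - standard_normal_cdf(z1)


# Area between 0 and z = 1 (the quantity tabulated in the text's table).
area_0_to_1 = probability_between(0.0, 1.0, 0.0, 1.0)
# Probability of falling within two standard deviations of the mean.
within_two_sigma = probability_between(-2.0, 2.0, 0.0, 1.0)
```

As in the worked example, the probability for an interval is the difference between two cumulative areas.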
This distribution is demonstrated in the following examples. From Table 6. The data indicate that the waiting time has a mean value of 2. The exponential distribution looks at the same underlying process but examines the duration of the interval between successive occurrences of a randomly occurring event.
Since the interval between the events is a continuous variable, the exponential distribution is a continuous distribution. The exponential distribution is widely used to describe the probability of failure of components that fail at a constant rate, regardless of their age. These components fail because of causes that occur randomly rather than because of systematic wear. For instance, electrical instruments used in engineering experimentation may often fail not because they are old but because they are accidentally dropped or are exposed to high voltage.
An exponential distribution is useful in describing this kind of instrumentation failure, and is used in Reliability Engineering for describing the probability of failure of components that fail due to randomly occurring causes. The failure of a mechanical component, on the other hand, is often due to wear caused by aging and should not be modeled by exponential distribution. In other words, if the time to failure of a component is described by this distribution, the time from the present that it will take for the component to fail is independent of the present age of the component.
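The age-independence property described above can be demonstrated with a short Python sketch. It assumes the usual exponential form, in which the probability of failure within time t is 1 - e^(-λt) and the mean time to failure is 1/λ; the 1000-hour mean life is a hypothetical value.

```python
import math


def failure_probability(t, rate):
    """Probability that the component fails within time t."""
    return 1.0 - math.exp(-rate * t)


def survival_probability(t, rate):
    """Probability that the component lasts at least time t."""
    return math.exp(-rate * t)


mean_life_h = 1000.0        # assumed mean time to failure, in hours
rate = 1.0 / mean_life_h    # constant failure rate

# Probability that a new component survives 1000 h.
new_survives = survival_probability(1000.0, rate)

# Probability that a component already 500 h old survives 1000 h more:
# P(T > 1500 | T > 500) = P(T > 1500) / P(T > 500).
aged_survives = survival_probability(1500.0, rate) / survival_probability(500.0, rate)
```

The two survival probabilities agree, which is the sense in which the remaining time to failure does not depend on the present age of the component.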
For Exponential Distribution 6. Find the probability that a given LED will fail (a) within days, (b) between and days, (c) will last at least days.
Solution: From the given data, p. Other Distribution Functions Several other types of probability distribution functions are used in engineering experiments. Detailed discussion of all these functions is beyond the scope of this book. In Table 6. For details on the subject, the reader is referred to more comprehensive texts on the subject, such as Lipson and Sheth and Lapin This is a discrete distribution. Used to predict the probability of occurrence of a specific number of events in a space or time interval if the mean number of occurrences is known.
Continuous, symmetrical, and most widely used distribution in experimental analysis in physical science. Used for explanation of random variables in engineering experiments, such as gas molecule velocities, electrical power consumption of households, etc.
Continuous, symmetrical, used for analysis of the variation of sample mean value for experimental data with sample size less than For sample sizes greater than 30, Student's t approaches normal distribution. Continuous, nonsymmetrical, used for analysis of variance of samples in a population.
For example, consistency of chemical reaction time is of prime importance in some industrial processes and X2 distribution is used for its analysis. This distribution is also used to determine goodness of fit of a distribution for a particular application. Continuous, nonsymmetrical, used for describing the life phenomena of parts and components of machines. Continuous, nonsymmetrical, used for analysis of failure and reliability of components, systems and assemblies.
Continuous, nonsymmetricaI, used for life and durability studies of parts and components. Continuous, symmetrical, used for estimating the probabilities of random values generated by computer simulation. In some cases, it is also necessary to estimate the population standard deviation, T. An estimate of the population mean, JL, is the sample mean, x [as defined by Eq.
However, simple estimation of values for these parameters is not sufficient. Different samples from the same population will yield different values. Consequently, it is also necessary to determine uncertainty intervals for the estimated parameters.
In the sections that follow, we discuss the determination of these uncertainty intervals. For further details, see Walpole and Myers.
The interval around x̄ is called the confidence interval on the mean. The central limit theorem makes it possible to estimate the confidence interval with a suitable confidence level.
Consider a population of the random variable x with a mean value of μ.
Each of these samples would have a mean value x̄ᵢ, but we would not expect each of these means to have the same value. In fact, the x̄ᵢ's are values of a random variable. The standard deviation of the mean is also called the standard error of the mean. For the central limit theorem to apply, the sample size n must be large.
In most cases, to be considered large, n should exceed 30 (Harnett and Murphy; Lapin). The following are important conclusions from the central limit theorem:
1. If the original population is normal, the distribution of the x̄ᵢ's is normal.
2. If the original population is not normal and n is large, the distribution of the x̄ᵢ's is normal.
Since x̄ is normally distributed, we can use the statistic z = (x̄ − μ)/σ_x̄.
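The behaviour of sample means under the central limit theorem can be checked numerically. The sketch below is only an illustration under stated assumptions: the population is taken to be uniform on (0, 1), which is clearly not normal, and the standard error is compared with the standard result σ/√n. The function names, the seed, and the sample counts are hypothetical choices.

```python
import math
import random
import statistics

def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n from a uniform(0, 1) population
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(num_samples)
    ]

def predicted_standard_error(population_std, n):
    """Standard error of the mean: population standard deviation / sqrt(n)."""
    return population_std / math.sqrt(n)

# Uniform(0, 1) has mean 0.5 and standard deviation sqrt(1/12).
means = simulate_sample_means(n=36, num_samples=2000)
observed = statistics.stdev(means)
predicted = predicted_standard_error(math.sqrt(1.0 / 12.0), 36)
print(f"observed standard error  {observed:.4f}")
print(f"predicted standard error {predicted:.4f}")
```

With n above 30, a histogram of `means` should look close to a normal curve centred on 0.5, even though the individual values are uniformly distributed.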
The estimation of the confidence interval is shown graphically in Figure 6. However, we only expect the true value of μ.
The confidence level is this probability, 1 − α. Similarly, in other situations we may be more interested in finding, at a given level of confidence, the minimum value that the mean can take. Such questions are answered by one-sided confidence intervals rather than the two-sided confidence intervals given in Eq. As before, this means −z_α.
Determine the one-sided confidence interval for the minimum value of the mean for the same confidence level. Because the number of samples is greater than 30, we can use the normal distribution and Eq.
Based on the nomenclature of Figure 6. We can now use Table 6. The value of z_{α/2} is 1. Using the sample standard deviation, S, as an approximation for the population standard deviation, σ, we can estimate the uncertainty interval on μ.
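For large samples, the two-sided and one-sided intervals can be computed without a printed table. The following sketch assumes n is greater than 30, so that the normal distribution applies and S can stand in for σ. It uses `statistics.NormalDist.inv_cdf` from the Python standard library in place of the table lookup; the function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

def two_sided_interval(sample_mean, sample_std, n, confidence=0.95):
    """Two-sided confidence interval on the mean for large n (normal z)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    half_width = z * sample_std / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def one_sided_lower_bound(sample_mean, sample_std, n, confidence=0.95):
    """Minimum value of the mean at the given confidence level (normal z)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha)  # z_alpha, all of alpha in one tail
    return sample_mean - z * sample_std / math.sqrt(n)
```

Because the one-sided bound places the whole of α in a single tail, it uses the smaller value z_α and therefore lies closer to the sample mean than the lower end of the two-sided interval at the same confidence level.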
Due to the uncertainty in the standard deviation, for the same confidence level, we would expect the confidence interval to be wider. In contrast to the normal distribution, which is independent of sample size, there is a family of t-distribution functions that depend on the number of samples. The number of degrees of freedom is given by the number of independent measurements minus the minimum number of measurements that are theoretically necessary to estimate a statistical parameter.
For the t-distribution, one degree of freedom is used in estimating the mean value of x, so ν = n − 1. It is unlikely that an engineer would actually use Eq. For purposes of the present course, all required values can be obtained from a single table. As the number of samples increases, the t-distribution approaches the normal distribution.
For lesser values of ν, the distribution is broader with a lower peak. The t-distribution can be used to estimate the confidence interval of a mean value of a sample with a certain confidence level for small sample sizes (less than 30). Only the most common values of t are used, those which correspond to common confidence levels.
But first we have to calculate the mean and standard deviation of the data. From Eq. Determine how many more systems should be tested. Hence, the solution process is one of converging trial and error.
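The small-sample interval based on the t-distribution can be sketched as follows. The Python standard library has no t-distribution, so this example assumes the value of t for ν = n − 1 degrees of freedom and the chosen confidence level is read from a table and passed in by the caller. The function name and the `t_value` parameter are illustrative.

```python
import math
import statistics

def t_confidence_interval(data, t_value):
    """Two-sided confidence interval on the mean for a small sample.

    t_value is the tabulated t for nu = len(data) - 1 degrees of freedom
    at the desired confidence level; it is supplied by the caller.
    Returns (sample mean, half-width of the interval).
    """
    n = len(data)
    if n < 2:
        raise ValueError("at least two measurements are required")
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    half_width = t_value * s / math.sqrt(n)
    return mean, half_width
```

The interval is then mean ± half-width. Because the half-width shrinks as 1/√n while the tabulated t also changes with n, finding the number of tests needed for a target half-width requires repeating the calculation with successive trial values of n.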
Then Eq. Using the standard normal distribution, we find from Table 6. This value of t can be used with Eq. Note that with the additional tests, the value of the sample mean, x̄, may also change and may no longer have the value h.
The best estimate of the population variance, σ², is the sample variance, S². As with the population mean, it is also necessary to establish a confidence interval for the estimated variance.
Consider a random variable x with population mean value μ and standard deviation σ. If we assume that x̄ is equal to μ, Eq. As with other probability density functions, the probability that the variable χ² falls between any two values is equal to the area under the curve between those values, as shown in Figure 6.
Substituting from Eq. In Eq. We look up the upper value. Looking under the column labeled 0. Although the χ² distribution and test have been developed for data with normal distributions, they are often used satisfactorily for populations with other distributions.
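A confidence interval on the variance can be sketched in the same table-driven way. The example assumes the standard relation χ² = (n − 1)S²/σ² and takes the two tabulated χ² values for ν = n − 1 as inputs, since the Python standard library provides no χ² quantile function. The parameter names are illustrative: `chi2_upper` is the table value with area α/2 to its right and `chi2_lower` the value with area 1 − α/2 to its right.

```python
def variance_confidence_interval(sample_variance, n, chi2_lower, chi2_upper):
    """Two-sided confidence interval on the population variance.

    chi2_upper: tabulated chi-squared value with area alpha/2 to its right
    chi2_lower: tabulated chi-squared value with area 1 - alpha/2 to its right
    Both are looked up for nu = n - 1 degrees of freedom by the caller.
    """
    if not 0.0 < chi2_lower < chi2_upper:
        raise ValueError("expected 0 < chi2_lower < chi2_upper")
    nu = n - 1
    low = nu * sample_variance / chi2_upper   # larger chi-squared, lower bound
    high = nu * sample_variance / chi2_lower  # smaller chi-squared, upper bound
    return low, high
```

The interval is not symmetric about S² because the χ² distribution itself is nonsymmetrical. Taking square roots of both bounds gives the corresponding interval on the standard deviation.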
The sample mean is 0. If some clear faults can be detected in measuring those specific values, they should be discarded. But often the seemingly faulty data cannot be traced to any specific problem. There exist a number of statistical methods for rejecting these wild or outlier data points.
The basis of these methods is to eliminate values that have a low probability of occurrence. For example, data values that deviate from the mean by more than two or more than three standard deviations might be rejected. It has been found that so-called two-sigma or three-sigma rejection criteria normally must be modified to account for the sample size.
Furthermore, depending on how strong the rejection criterion is, good data might be eliminated or bad data included. The next step is to find a value of τ from a table given here as Table 6.
According to this method, only one data value should be eliminated. The process can be repeated until no more of the data can be eliminated. It should be noted that eliminating an outlier is not an entirely positive event. The outlier probably resulted from a problem with the measurement system.
Determine whether any of the values can be rejected. We will use the Thompson τ test to determine possible outliers. The largest and the smallest values are suspected to be outliers.
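The one-at-a-time rejection procedure can be sketched as below. This is an illustration under assumptions rather than the text's exact algorithm: it takes the suspect point to be the one with the largest absolute deviation from the mean, rejects it when that deviation exceeds τ·S, and then recomputes the mean and S before testing again. The `tau_table` argument is a hypothetical dictionary mapping sample size n to the tabulated τ, which the caller must fill in from the table.

```python
import statistics

def thompson_tau_filter(data, tau_table):
    """Reject outliers one at a time using the modified Thompson tau test.

    tau_table maps sample size n to the tabulated tau value (caller supplied).
    Returns (kept values, rejected values in order of rejection).
    """
    kept = list(data)
    rejected = []
    while len(kept) >= 3 and len(kept) in tau_table:
        mean = statistics.fmean(kept)
        s = statistics.stdev(kept)  # sample standard deviation
        # The suspect is the single value farthest from the current mean.
        suspect = max(kept, key=lambda value: abs(value - mean))
        if abs(suspect - mean) > tau_table[len(kept)] * s:
            kept.remove(suspect)       # eliminate only this one value
            rejected.append(suspect)   # then recompute mean and S
        else:
            break                      # no further value can be rejected
    return kept, rejected
```

Returning the rejected values alongside the kept ones makes it easier to investigate them, since an outlier often points to a problem with the measurement system.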
We can now recalculate S and x̄ to obtain 0.
However, in some cases the scatter may be so large that it is difficult to detect a trend. Consider an experiment in which an independent variable x is varied systematically and the dependent variable y is then measured. We would like to determine whether the value of y depends on the value of x. If the results appeared as in Figure 6.
On the other hand, if the data appeared as shown in Figure 6. If the data appeared as shown in Figure 6. There appears to be a trend of increasing y with increasing x, but the scatter is so great that the apparent trend might be a consequence of pure chance. Fortunately, there exists a statistical parameter called the correlation coefficient, which can be used to determine whether an apparent trend is real or could simply be a consequence of pure chance.
The correlation coefficient, r_xy, is a number whose magnitude can be used to determine whether there in fact exists a functional relationship between two measured variables x and y. For example, one would not expect even a weak correlation between exam scores and the height of the student. On the other hand, we might expect a fairly strong correlation between the total electric power consumption in a region and the time of the day. A value of −1 indicates a perfectly linear relationship with negative slope.
A value of zero indicates that there is no linear correlation between the variables. Even if there is no correlation, it is unlikely that r_xy will be exactly zero. For any finite number of data pairs, pure chance means that a nonzero correlation coefficient is likely.
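The linear correlation coefficient can be computed directly from the data pairs. The sketch below assumes the usual product-moment definition, in which the sum of the products of the deviations of x and y from their means is divided by the square root of the product of the two sums of squared deviations. The function name is illustrative.

```python
import math

def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r_xy for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("r_xy is undefined when x or y does not vary")
    return sxy / math.sqrt(sxx * syy)
```

The result always lies between −1 and +1. Whether a particular nonzero value is meaningful still depends on the number of data pairs, so the computed value has to be judged against a critical value for that sample size.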
Harnett and Murphy discuss the process of determining whether a given correlation coefficient is significant, and Johnson also discusses the issue. For practical problems, this process can be simplified to the form of a single table.
Critical values for r have been established that can be compared with the computed r_xy. For two variables and n data pairs, the appropriate critical values of r, r_t, have been computed and are presented in Table 6. The values of r in this table are limiting values. Conversely, if the experimental value exceeds the table value, we can expect that the experimental value shows a real correlation with confidence level 1 − α.
Note: This approach assumes that r_xy could be either positive or negative. If we know that r_xy can be only positive or only negative, then the approach must be modified. See Johnson.
For a given set of data, we obtain r_t from the table and compare it with the value of r_xy computed from the data. A value of r_xy less than r_t implies that we cannot be confident that a linear functional relationship exists. On the other hand, some functional relationships e. First, a single bad data point can have a strong effect on the value of r_xy.
If possible, outliers should be eliminated before evaluating the coefficient. It is also a mistake to conclude that a significant value of the correlation coefficient implies that a change in one variable causes the other variable to change.
Causality should be determined from other knowledge about the problem.
Example 6. The following data for the same car with the same driver were measured at different races:
Lap time 8
Ambient temperature (°F) 40 47 55 62 66 88
First, we plot the data as shown in Figure E6.
From the plot, it looks as though there might exist a weak positive correlation between lap time and ambient temperature, but we can compute the correlation coefficient to determine whether this correlation is real or might be due to pure chance. We can determine this coefficient using Eq. For six pairs of data, from Table 6. Since r_xy is less than r_t. The calculation of r_xy is a feature of some spreadsheet programs.
The user needs only to input two columns of numbers and then call the appropriate function. One of the most common functions used for this purpose is the straight line.
Linear fits are often appropriate for the data, and in other cases the data can be transformed to be approximately linear. As shown in Figure 6. We would like to obtain values of the constants a and b. If we have only two pairs of data, the solution is simple, since the points completely determine the straight line.
However, if there are more points, we want to determine a 'best fit' to the data. The experimenter can use a ruler and 'eyeball' a straight line through the data, and in some cases this is the best approach. A more systematic and appropriate approach is to use the method of least squares or linear regression to fit the data.
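A least-squares straight-line fit can be sketched in a few lines. The example assumes the fitted line has the form y = a·x + b, with a the slope and b the intercept, and uses the standard closed-form solution that minimizes the sum of the squared vertical errors. The function name is illustrative.

```python
def least_squares_line(xs, ys):
    """Fit y = a*x + b by least squares; returns (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b = (sum_y - a * sum_x) / n                     # intercept
    return a, b
```

With exactly two data pairs the fitted line passes through both points. With more pairs it generally passes through none of them but minimizes the total squared error in y.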
Regression is a well-defined mathematical formulation that is readily automated. Let us assume that the test data consist of data pairs (xᵢ, yᵢ). For each value of xᵢ, we then have an error.