
n = [t² N p(1-p)] / [t² p(1-p) + a² (N-1)],

with N being the total number of cases (the population size), n the sample size, a the expected error, t the value taken from the t-distribution corresponding to a chosen confidence level, and p the probability of an event.
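As a rough illustration, the sample-size formula can be evaluated directly. The sketch below is only a minimal example: the function name `sample_size` and the input values (N = 10,000, p = 0.5, a = 0.05, and t = 1.96, the usual large-sample value for 95% confidence) are assumptions chosen for illustration, not values from the text.

```python
import math

def sample_size(N, p, a, t):
    """n = t^2 N p(1-p) / (t^2 p(1-p) + a^2 (N-1)); all inputs are illustrative."""
    numerator = t**2 * N * p * (1 - p)
    denominator = t**2 * p * (1 - p) + a**2 * (N - 1)
    return numerator / denominator

# Hypothetical inputs: population of 10,000, p = 0.5, error 0.05, t = 1.96
n = sample_size(N=10_000, p=0.5, a=0.05, t=1.96)
n_rounded_up = math.ceil(n)  # round up so the sample is not too small
```

Taking p = 0.5 maximizes p(1-p), so it gives the most conservative (largest) sample size when p is unknown.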

Cross-Sectional Sampling: A cross-sectional study is the observation of a defined population at a single point in time or over a single time interval. Exposure and outcome are determined simultaneously.

What is a statistical instrument? A statistical instrument is any process that aims at describing a phenomenon by using any instrument or device; however, the results may be used as a control tool. Examples of statistical instruments are questionnaires and survey sampling.

What is the grab sampling technique? Grab sampling takes a relatively small sample over a very short period of time; the results obtained are usually instantaneous. Passive sampling, by contrast, is a technique where a sampling device is used for an extended time under similar conditions. Depending on the desired statistical investigation, passive sampling may be a useful alternative to, or even more appropriate than, grab sampling. However, a passive sampling technique needs to be developed and tested in the field.

Further Reading:
Thompson S., Sampling, Wiley, 2002.


Statistical Summaries

Representative of a Sample: Measures of Central Tendency Summaries

How do you describe the "average" or "typical" piece of information in a set of data? Different procedures are used to summarize the most representative information, depending on the type of question asked and the nature of the data being summarized.

Measures of location give information about the location of the central tendency within a group of numbers. The measures of location presented in this unit for ungrouped (raw) data are the mean, the median, and the mode.

Mean: The arithmetic mean (or the average, simple mean) is computed by summing all numbers in an array of numbers (xi) and then dividing by the number of observations (n) in the array.

Mean = x̄ = Σ xi / n,     where the sum is over all i.

The mean uses all of the observations, and each observation affects the mean. Even though the mean is sensitive to extreme values (extremely large or small observations can pull the mean toward them), it is still the most widely used measure of location. This is because the mean has valuable mathematical properties that make it convenient for use with inferential statistical analysis. For example, the sum of the deviations of the numbers in a set of data from the mean is zero, and the sum of the squared deviations from the mean is smaller than the sum of the squared deviations from any other value.
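The two properties just mentioned can be checked numerically. The following is a small sketch on made-up data (the list `data` and the helper `sum_squared_deviations` are illustrative assumptions):

```python
data = [2.0, 4.0, 4.0, 5.0, 10.0]  # illustrative values only

mean = sum(data) / len(data)

# Property 1: deviations from the mean sum to zero (up to rounding error).
sum_of_deviations = sum(x - mean for x in data)

# Property 2: the sum of squared deviations is smallest around the mean.
def sum_squared_deviations(center):
    return sum((x - center) ** 2 for x in data)

ss_about_mean = sum_squared_deviations(mean)
ss_about_other = sum_squared_deviations(4.0)  # any other value gives a larger sum
```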

You might like to use Descriptive Statistics to compute the mean.

Weighted Mean: In some cases, the data in the sample or population should not be weighted equally; rather, each value should be weighted according to its importance.
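The text does not spell out the formula; the usual definition is the sum of weight times value divided by the sum of the weights. A minimal sketch under that assumption, with hypothetical scores and weights:

```python
def weighted_mean(values, weights):
    """Standard weighted mean: sum(w * x) / sum(w). Assumes sum(weights) != 0."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical example: a midterm score (weight 1) and a final score (weight 3)
scores = [80.0, 90.0]
weights = [1.0, 3.0]
result = weighted_mean(scores, weights)  # (80 + 270) / 4
```

With equal weights, this reduces to the ordinary arithmetic mean.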

Median: The median is the middle value in an ordered array of observations. If there is an even number of observations in the array, the median is the average of the two middle numbers. If there is an odd number of observations in the array, the median is the middle number.
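The odd and even cases described above can be sketched as follows. The function name `median_of` and the sample values are illustrative, and the sketch assumes a non-empty list:

```python
def median_of(values):
    """Median of a non-empty list: middle value, or average of the two middle values."""
    ordered = sorted(values)          # the array must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair

odd_case = median_of([7, 1, 3])
even_case = median_of([7, 1, 3, 5])
```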

The median is often used to summarize the distribution of an outcome. If the distribution is skewed, the median and the interquartile range (IQR) may be better than other measures to indicate where the observed data are concentrated.

Generally, the median provides a better measure of location than the mean when there are some extremely large or small observations; i.e., when the data are skewed to the right or to the left. For this reason, median income is used as the measure of location for U.S. household income. Note that if the median is less than the mean, the data set is usually skewed to the right; if the median is greater than the mean, it is usually skewed to the left. For a normal population, the sample median is approximately normally distributed with mean μ (the population mean) and a standard error equal to (π/2)^(1/2), about 1.25, times the standard error of the mean.

The mean has two distinct advantages over the median. It is more stable, and one can compute the mean of two combined samples from the two sample means, weighting each by its sample size.
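Combining two sample means can be sketched as below, weighting each mean by its sample size. The two samples and the function name `combined_mean` are made up for illustration:

```python
import math

def combined_mean(n1, mean1, n2, mean2):
    """Mean of the pooled sample, computed from the two sample sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

sample_a = [8.0, 10.0, 12.0]          # illustrative data
sample_b = [18.0, 20.0, 22.0, 20.0]   # illustrative data

mean_a = sum(sample_a) / len(sample_a)
mean_b = sum(sample_b) / len(sample_b)

from_means = combined_mean(len(sample_a), mean_a, len(sample_b), mean_b)
pooled = sum(sample_a + sample_b) / (len(sample_a) + len(sample_b))
same_answer = math.isclose(from_means, pooled)
```

No comparable shortcut exists for the median: the median of the pooled sample generally cannot be recovered from the two sample medians alone.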

Mode: The mode is the most frequently occurring value in a set of observations. Why use the mode? The classic example is the shirt or shoe manufacturer who wants to decide what sizes to introduce. Data may have two modes. In this case, we say the data are bimodal; sets of observations with more than two modes are referred to as multimodal. Note that the mode is not always a helpful measure of location, because there can be more than one mode or even no mode.

When the mean and the median are known, it is possible to estimate the mode of a unimodal distribution using the other two averages as follows:

Mode ≈ 3(median) - 2(mean)

This estimate is applicable to both grouped and ungrouped data sets.
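A minimal sketch of this estimate follows. The function name `estimate_mode` and the input numbers are hypothetical, and the result is only an approximation for a unimodal distribution:

```python
def estimate_mode(median, mean):
    """Empirical estimate: mode ~ 3 * median - 2 * mean (unimodal data assumed)."""
    return 3 * median - 2 * mean

# Hypothetical right-skewed summary: the mean (12) exceeds the median (10)
approx_mode = estimate_mode(median=10.0, mean=12.0)
```

In this example the estimated mode lies below the median, which in turn lies below the mean, as is typical for data skewed to the right.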

Whenever more than one mode exists, the population from which the sample came is likely a mixture of more than one population, as shown, for example, in the following bimodal histogram.


A Mixture of Two Different Populations

However, notice that a Uniform distribution has an uncountable number of modes, all with equal density; therefore it is considered a homogeneous population.

Almost all standard statistical analyses are conditioned on the assumption that the population is homogeneous.

Notice that Excel has very limited statistical capability. For example, it displays only one mode, the first one, which can be very misleading. You can find out whether there are others only by inspection, as follows: create a frequency distribution by invoking the menu sequence Tools, Data analysis, Frequency, and follow the instructions on the screen. You will see the frequency distribution and can then find the mode visually. Excel also does not draw a Stem and Leaf diagram. Commercial off-the-shelf packages such as SAS and SPSS display a Stem and Leaf diagram, which is a frequency distribution of a given data set.
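Outside a spreadsheet, the same inspection (build a frequency distribution, then pick every value with the highest count) takes only a few lines. This is a sketch with hypothetical data; the function name `all_modes` is an assumption for illustration:

```python
from collections import Counter

def all_modes(values):
    """Return every value that attains the highest frequency, in sorted order."""
    counts = Counter(values)                  # the frequency distribution
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

shirt_sizes = [38, 40, 40, 42, 42, 44]        # illustrative data
modes = all_modes(shirt_sizes)                # both 40 and 42 occur twice
```

Python's standard library offers the same result through `statistics.multimode` (Python 3.8 and later), which returns the modes in order of first appearance.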

Selecting Among the Mode, Median, and Mean

It is a common mistake to specify the wrong index for central tendency.


Selecting Among the Mode, Median, and Mean

The first consideration is the type of data: if the variable is categorical, the mode is the single measure that best describes the data.

The second consideration in selecting the index is to ask whether the total of all observations is of any interest. If the answer is yes, then the mean is the proper index of central tendency.

If the total is of no interest, then depending on whether the histogram is symmetric or skewed, one must use either the mean or the median, respectively.

In all cases the histogram must be unimodal. However, notice that, e.g., a Uniform distribution has an uncountable number of modes, all with equal density; therefore it is considered a homogeneous population.

Notice also that:

|Mean - Median| ≤ σ
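This bound can be checked numerically, even on data with an extreme value. The sketch below uses made-up data and the population standard deviation (`statistics.pstdev`) as σ:

```python
import statistics

data = [1, 2, 3, 4, 100]  # illustrative data with one extreme value

mean = statistics.mean(data)        # 22
median = statistics.median(data)    # 3
sigma = statistics.pstdev(data)     # population standard deviation, about 39.0

gap = abs(mean - median)            # 19
bound_holds = gap <= sigma
```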

The main characteristics of these three statistics are tabulated below:

The Main Characteristics of the Mode, the Median, and the Mean
| Fact No. | The Mode | The Median | The Mean |
|---|---|---|---|
| 1 | It is the most frequent value in the distribution; it is the point of greatest density. | It is the value of the middle point of the array (not the midpoint of the range), such that half the items are above it and half below it. | It is the value in a given aggregate which would obtain if all the values were equal. |
| 2 | The value of the mode is established by the predominant frequency, not by the values in the distribution. | The value of the median is fixed by its position in the array and does not reflect the individual values. | The sums of deviations on either side of the mean are equal; hence, the algebraic sum of the deviations is equal to zero. |
| 3 | It is the most probable value, hence the most typical. | The aggregate distance between the median point and all the values in the array is less than from any other point. | It reflects the magnitude of every value. |
| 4 | A distribution may have 2 or more modes. On the other hand, there is no mode in a rectangular distribution. | Each array has one and only one median. | An array has one and only one mean. |
| 5 | The mode does not reflect the degree of modality. | It cannot be manipulated algebraically: medians of subgroups cannot be weighted and combined. | Means may be manipulated algebraically: means of subgroups may be combined when properly weighted. |
| 6 | It cannot be manipulated algebraically: modes of subgroups cannot be combined. | It is stable in that grouping procedures do not affect it appreciably. | It may be calculated even when individual values are unknown, provided the sum of the values and the sample size n are known. |
| 7 | It is unstable in that it is influenced by grouping procedures. | Values must be ordered, and may be grouped, for computation. | Values need not be ordered or grouped for this calculation. |
| 8 | Values must be ordered and grouped for its computation. | It can be computed when ends are open. | It cannot be calculated from a frequency table when ends are open. |
| 9 | It can be calculated when table ends are open. | It is not applicable to qualitative data. | It is stable in that grouping procedures do not seriously affect it. |

The Descriptive Statistics JavaScript provides a complete set of the summary statistics you are likely to need. You might like to use it for some numerical experimentation to validate the above assertions and gain a deeper understanding.


Specialized Averages: The Geometric & Harmonic Means

The Geometric Mean: The geometric mean (G) of n non-negative numerical values is the nth root of the product of the n values.

If some values are very large in magnitude and others are small, then the geometric mean is a better representative of the data than the simple average. In a "geometric series", the most meaningful average is the geometric mean (G). The arithmetic mean is very biased toward the larger numbers in the series.

An Application: Suppose sales of a certain item increase to 110% of the initial level in the first year and to 150% of that in the second year. For simplicity, assume you sold 100 items initially. Then the number sold in the first year is 110 and the number sold in the second year is 150% x 110 = 165. The arithmetic average of 110% and 150% is 130%, so we would incorrectly estimate that the number sold in the first year is 130 and the number sold in the second year is 169. The geometric mean of 110% and 150% is G = (1.65)^(1/2), so we would correctly estimate that we would sell 100 x G² = 165 items in the second year.
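The arithmetic in this application can be reproduced in a few lines. This is only a sketch of the example's numbers; the variable names are illustrative:

```python
import math

growth_factors = [1.10, 1.50]   # 110% in year one, then 150% of that in year two
initial_sales = 100

# Geometric mean: n-th root of the product of the n factors
G = math.prod(growth_factors) ** (1 / len(growth_factors))
# Arithmetic mean of the same factors, for comparison
A = sum(growth_factors) / len(growth_factors)

second_year_from_G = initial_sales * G**2   # about 165, the actual second-year sales
second_year_from_A = initial_sales * A**2   # about 169, an overestimate
```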

The Harmonic Mean: The harmonic mean (H) is another specialized average, which is useful in averaging variables expressed as a rate per unit of time, such as miles per hour or number of units produced per day. The harmonic mean (H) of n non-zero numerical values x(i) is: H = n / [Σ (1/x(i))].

An Application: Suppose 4 machines in a machine shop are used to produce the same part. The four machines take 2.5, 2.0, 1.5, and 6.0 minutes, respectively, to make one part. What is the average time needed to produce one part?

The harmonic mean is: H = 4 / [(1/2.5) + (1/2.0) + (1/1.5) + (1/6.0)] = 2.31 minutes.

If all machines work for one hour, how many parts will be produced? Since four machines running for one hour represent 240 minutes of operating time, 240 / 2.31 = 104 parts will be produced.
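The machine-shop figures can be verified with the standard library's `statistics.harmonic_mean`, and cross-checked by counting each machine's hourly output directly. A short sketch (variable names are illustrative):

```python
import statistics

minutes_per_part = [2.5, 2.0, 1.5, 6.0]   # one entry per machine, from the example

H = statistics.harmonic_mean(minutes_per_part)   # about 2.31 minutes per part

# Four machines running for one hour give 240 machine-minutes in total.
total_minutes = 60 * len(minutes_per_part)
parts_from_H = total_minutes / H

# Cross-check: add up what each machine produces in 60 minutes.
parts_direct = sum(60 / t for t in minutes_per_part)   # 24 + 30 + 40 + 10
```

The arithmetic mean of the four times is 3.0 minutes, which would understate the output as 240 / 3.0 = 80 parts.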

The Order Among the Three Means: If all three means exist, then the Arithmetic Mean is never less than the other two; moreover, the Harmonic Mean is never larger than the other two. For positive values, H ≤ G ≤ A.
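The ordering can be checked numerically on any set of positive values. The sketch below uses illustrative data and the standard library functions `statistics.harmonic_mean` and `statistics.geometric_mean` (the latter requires Python 3.8 or later):

```python
import statistics

values = [2.5, 2.0, 1.5, 6.0]   # any positive values; chosen here for illustration

A = statistics.mean(values)             # arithmetic mean, 3.0
G = statistics.geometric_mean(values)   # geometric mean, about 2.59
H = statistics.harmonic_mean(values)    # harmonic mean, about 2.31

ordering_holds = H <= G <= A
```

The three means coincide only when all the values are equal.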

You might like to use The Other Means JavaScript in performing some numerical experimentation for validating the above assertions for a deeper understanding.

Further Reading:
Langley R., Practical Statistics Simply Explained, 1970, Dover Press.


Histogramming: Checking for Homogeneity of Population

A histogram is a graphical presentation of an estimate for the density (for continuous random variables) or probability mass function (for discrete random variables) of the population.

The geometric features of a histogram enable us to find useful information about the data, such as:

  1. The location of the "center" of the data.
  2. The degree of dispersion.
  3. The extent to which it is skewed, that is, does not fall off symmetrically on both sides of its peak.
  4. The degree of peakedness: how steeply it rises and falls.

The mode is the most frequently occurring value in a set of observations. Data may have two modes. In this case, we say the data are