Free Microsoft Excel 2013 Quick Reference

Median Absolute Deviation (MAD) Calculation

Hello All,

I am interested in programming an Excel 2010 spreadsheet to perform an outlier assessment for a given data set using Median Absolute Deviation (MAD). If my data set is 0.913, 0.927, 0.778, 0.912, 1.387, 0.917, 0.920, 0.918, 0.916 and 0.921, how should I program my spreadsheet to calculate MAD? In addition, what is the best means in which to highlight if a value is an outlier? Thanks.

Post your answer or comment

comments powered by Disqus
Hi Everyone,

I need your assistance working on a large data set that I am working with. I am looking for a restrictive way of identifying the outliers in my data. I am currently using the median and an array formula to identify outliers that are two deviations from the Median Absolute Deviation. This gives me both positive and negative outliers. I used an array formula to distinguish between normal data and an outlier. However outlier shows up for both positive data and negative data. How do I distinguish between the two types of outliers?
Secondly, is there a better approach to finding outliers and how do I go about capturing the data.
Any assistance you have would be greatly appreciated.



Can someone please help me with my query? I have attached my file ("STDEV.xls").

I have a large database and would like to calculate standard deviation based on a condition "Yes" or "No".

The attached file is a small selected example of my database.

In cell B10, I am using the following array formula to calculate the standard deviation: "{=STDEV(($D$1:$IV$1="No")*(D2:IV2))}".

Please note that in some instances, the cell in the "Value" row will be blank. Again, I have shown this above.

Please can someone help me to obtain the correct answer of "6.00" and not "6.31" as calculated by the array formula.

What change(s) do I need to make to my array formula?

Many thanks.

In 3 rows (B1:B3) and in 5151 columns (till column GPD) I got alle the combinations which sum up to 1. (Thanx for the help in an other thread).

The first return I calculated in the following way.

The Standard Deviation is calculated in this way.

See attached excel sheet.

It would be very helpful if a VBA script can be written which automatically calculates the 5151 returns and the 5151 standard deviation.

I am new to Excel and i am wanting to know how to Calculatet a Forecasted
Increase using say P1 as an Absolute Reference and calculate the percentage
increase on a monthly income.
This is so the formula could then be replicated across without having to
re-type the information in the Ref Cell

Hi all,

This is my first post.

I have 5 numbers (can be with decimals) in column A (A1:A5). For these
numbers, I want all combinations as well as the median of these

I'm pretty far already. However, the median seems to be calculated

Who can help me?


Sub test()
    ListCombos Range("A1:A5"), 3, "C:Q1_3er.csv"
End Sub

Sub ListCombos(r As Range, ByVal m As Long, sFile As String)
     ' lists the combinations of r choose m to file sFile
     ' r is a single-column or single-row range
    Dim ai()    As Long
    Dim i       As Long
    Dim n       As Long
    Dim sOut    As String
    Dim sOutMedian    As String

    Dim iFF     As Integer

    If r Is Nothing Then Exit Sub
    If r.Rows.Count <> 1 And r.Columns.Count <> 1 Then Exit Sub

    n = r.Count
    If m < 1 Then Exit Sub
    If m > n Then m = n

    iFF = FreeFile
    Open sFile For Output As #iFF

    ReDim ai(1 To m)

    ai(1) = 0
    For i = 2 To m
        ai(i) = i
    Next i

        For i = 1 To m - 1
            If ai(i) + 1 < ai(i + 1) Then
                ai(i) = ai(i) + 1
                Exit For
                ai(i) = i
            End If
        Next i
        If i = m Then
            If ai(m) < n Then
                ai(m) = ai(m) + 1
                Exit Do
            End If
        End If

         ' catenate and write to file
        sOut = vbNullString
        Call Sort(ai)
        sOutMedian = median(ai)

        For i = 1 To m
             sOut = sOut & r(ai(i)).Text & ","
        Next i
        Write #iFF, Left(sOut, Len(sOut) - 1), sOutMedian

    Close #iFF
End Sub

 Sub Sort(Arr() As Long)

    Dim Temp As Double
    Dim i As Long
    Dim j As Long

    For j = 2 To UBound(Arr)
        Temp = Arr(j)
        For i = j - 1 To 1 Step -1
            If (Arr(i) <= Temp) Then GoTo 10
            Arr(i + 1) = Arr(i)
        Next i
        i = 0
10      Arr(i + 1) = Temp
    Next j

End Sub
Function median(Arr() As Long)
    Call Sort(Arr)

    If UBound(Arr) Mod 2 = 1 Then
        median = Arr(Int(UBound(Arr) / 2) + 1)
        median = (Arr(UBound(Arr) / 2) + Arr(Int(UBound(Arr) / 2) +
1)) / 2
    End If
End Function


I have a dynamic range that I need to be able to calculate the standard deviation for. The dynamic range sometimes has blank cells and #N/A cells which I do not want included in the calculation. However, there are sometimes cells with the number 0 in it (instead of blank) which I do want to include. I have come up with an array formula, but the standard deviation it calculates is not correct. The formula is:


*This is an array formula calculated by pressing CTRL+SHIFT+ ENTER

The number it calculates is 0.43, however, this is incorrect. If you just calculate the standard deviation of the actual data points in excel (cells A138:A5055 in this case) : STDEV(A138:A5055), it returns 0.17. This is the correct standard deviation. Could someone please help me with this?

Kind Regards,



I am trying to find medians for a series of values (in a column) separated by blank rows where the formula knows to end the median calculation when it runs into the first blank cell--and the median number will be calculated in this blank cell.

For instance, cells O3:O15 are filled in, O16 is blank, I want a code that can find the blank O16 cell, and in that cell calculate the median of O3:O15.

I have about 150,000 rows, and there is not a uniform spacing between the blank cell ranges.

thanks in advance!


Hello all together!

So here´s my mathematical problem:
I´ve to figure out the reproducibility of data from repeated analysis creating the arithmetic mean (Interassay Variability)including the absolute error.
EXCEL peforms three diffrent kinds of analysis for this kind of problem:

- The variance(discord)
- The mean absolute deviation
- The standard deviation

All footing on the basic population (universe).
So I would like to know which is the mathematical correct one and why. Because I tried all three with the same data and got three different errorvalues.

Thanks for reading. Bye.
(a biologist lost in the middle of physicians)

Hi all, this is my first post to this thread. I need help with an excel spreadsheet and it seems like this is the place to be. I have a set of financials that looks at companies in different subgroups. Each subgroup has its own mean and median calcualtion. After the last subgroup are two lines which have the total means and medians of all the subgroups. I am wondering if there is a macro available so that if I hid one subgroup, I could have the total mean and median update to only calculate the visible rows. Can this be done with a macro or formula.

Again, if its takking the total means of subgroups a1:a8, a10:a16, a18:a24 and I hide a1:a8, i want my total mean and median lines to update to calculate only visible sheets. I will wait with bated breath with the hope that someone better than I will be able to assist in this request.

Excel 2007: Functions: Statistical Functions | Quamut: the go to how ...

Function Description Syntax AVEDEV Returns the average of the absolute deviations of data points from their mean (number1,number2,...) AVERAGE Returns the average of its ... ...

Thanks for all the help I've had so far from the forums - one more question!

What I have so far is a front sheet of data of numbers of crimes in a given time period (fortnight) arranged in columns according to crime type.

Each column of data (crime series) is then taken into separate and subsequent worksheets as a row using the OFFSET function and the mean, standard deviation etc. calculated. This is then used to update a dynamic statistical process control chart using the latest n (user selectable) periods and where the row is a named series.

I would like the spreadsheet to be as user-friendly (idiot-proof!) as possible and so would like others to only have to input figures on the initial data sheet and have everything else automated. The problem is that where there is no data to drag through from the data sheet to the individual worksheets (i.e. numbers of crimes happening in the future) I end up with a zero as the value, and eventually zeros for the mean/standard deviation. This in turn corrupts the control chart.

Is there a way in which I can copy the various formulae across the various rows, but have the cells show as blank if no data is available, and also have the chart plot the latest n periods of data (also discounting the zeros/blanks)?

Hope this makes sense,


I have a problem with interpolating. When the residuel of a data point (polyline-data point) is 2x standard deviation (allready calculated), then I want reject the data point and it appears red, "rejected" and thick. Then I want to interpolate with the previus and next datapoint.

When 2 datapoints are rejected after each other this no longer goes. When this is the case I want to replace the data point with the polyline value.

I tried to make something, but it runs stuck.

'Now interpolate through

CheckedRow = Headerrows + 1

Do While CheckedRow < NumRows

    'Set variables
    checkedKP = Cells(CheckedRow, KPColumn)
    CheckedValue = Cells(CheckedRow, AfterRejectionEColumn)
    If CheckedValue <> "Rejected" Then
        'Leave as is
        Counter = 1
    ElseIf CheckedValue = "Rejected" Then
        LastgoodKP = Cells(CheckedRow - 1, KPColumn)
        LastgoodTop = Cells(CheckedRow - 1, EastColumn)
        LastGoodRow = CheckedRow - 1
        NextgoodKP = Cells(CheckedRow + 1, KPColumn)
        NextgoodTop = Cells(CheckedRow + 1, AfterRejectionEColumn)
        Interprecords = (NextgoodKP - LastgoodKP)
        Topdiff = (NextgoodTop - LastgoodTop)
        Cells(CheckedRow, AfterRejectionEColumn) = (Counter + 1) * ((Topdiff * (checkedKP - LastgoodKP)) / Interprecords) +
       If CheckedValue + 1 = "Rejected" Then
        Cells(CheckedRow, AfterRejectionEColumn) = Cells(CheckedRow, PolyEColumn)
        Counter = Counter + 1
    End If
    End If
    CheckedRow = CheckedRow + Counter ' Move on as many as required to get next non rejected value

I just wanted to cofirm . . .will the Avedev give me the same results as a
Mean Absolute Deviation?

Statistically Challenged in Chicago!



I'm trying match values (and set a Yes / No result) from values in two very large lists.

List 1 (approx 170,000 rows) contains the 'Find What' values
List 2 (approx 980,000 rows) contains the 'In What' values

Values in list 1 will be unique in list 2, but not all values in list 1 will appear in list 2.

The values in each list are all 16 character stings

This is a one-off otherwise I'd probably import into a AccessDB

Using formulas is taking an absolute age to calculate using MATCH function, so I'm wondering if coding is the better angle.

Any ideas/insight/methods welcome.

(Using XL07, on Core2Duo)

Hey all,

I am currently writing a macro to facilitate the data analysis process of my experimental data. The next step I would like to take with this macro is using the standard deviation of my data to see if it is plausible to neglect certain outlying data and what results I obtain after doing so. I have developed a logic statement in excel should do just what I desire however the original equation uses relative references which isn't correct so I need to actually enter the formulas from those cells into my formula. The problem that has arisen is that my standard deviation is calculated using an array. Is it possible for me to use an embedded array for just that part of the formula?

I am trying to run a Monte Carlo simulation in using random number generator function (RAND()) with X number of trials. I have specified the constraints of the RAND function so every time I press F9 a random number is generated.

Now, my problem is when I try to run the simulation with X number of trials. I defined the rectangle (in terms of number of trials, mean, standard error, median, standard deviation, variance, etc. on the vertical axis), selected the rectangle along with the neighboring (to the right) cells, but when I press "ctrl+w" to run the simulation I keep getting the default Excel exit message "Do you want to save the changes you made to...", since Excel recognizes this action in accordance with its shortcut definition.
The interesting part is that I have another simulation file that works flawlessly. So, when I run the simulations in this (second) file, and then go back to my first file (the one described above) the simulations work fine. But when I completely exit Excel and then open up the first file again, I face the same problem.

I would like to ask if someone would help me get around this unusual issue.

Thank you very much for your assistance

Hi Guys,

Not sure of the right tiltle here goes

I am looking for a formulas to first find the middle number in a column of numbers eg 1,2,3,4,5 3 is the middle (similar to median) thats where the calculations start...

it then assigns values of minus to the numbers above the middle and plus values to the numbers below the middle
1 -50
2 -50
3 0
4 +50
5 +50

now when it comes to even numbers eg 1,2,3,4,5,6 if i use median it divide 3 & 4 and comes up with 3.5 ........ i want it to recognize 3 and 4 as the middle numbers
and assign plus and minuses above and below the middle numbers
1 -50
2 -50
3 -25
4 +25
5 +50
6 + 50

I do hope you all understand, this has been buggin me for weeks

cheers in advance

Dear All !

I have a data set plotted in Excel.

Is there any way to show automatically the minimum, maximum and median value
of this data set on the chart? (In a form of label or anything similar)
Minimum and maximum values are already part of data set.
Median value is a calculated value and is in another cell(not part of data

Thank You in advance


I am trying to create a worksheet capable of calculating the turnaround processing time for emails in a 14 days time frame.

I am wondering if it is possible to create a dynamic excel worksheet which can calculate Mean, Median, Mode and standard deviation within a range of data as Column B listed below changes every forthnightly.

Sample data below:

Column A Column B
Number of days Email
O – 1 600
2 - 3 700
4 - 5 100
6 - 7 50
7 - 14 30

- First row: 600 emails took a day to complete
- Last row: 30 emails took between 7 - 14 days to complete

1. To find the average processing time
2. To find the median processing time in days
3. What is the standard deviation (in days)

Would greatly appreciate if someone could help.

I apologise if I was unable to articulate the whole scenario clearly.

Many Thanks!

Hi all,

I'm stumped. I have a very large set of data (over 200.000 records in
MSAccess DB) that I'm analyzing using Pivot Tables. Average and Standard
deviation functions are "built-in" so no problem there... now, how can I
calculate the median???

Please help!
Thanks in Advance

Hi People,

I have a set of % score values (e.g. 88%, 94%, 82%, 67%)

I would like to know if there is a relatively simple formula that can be used to calculate the AVERAGE DEVIATION from 100%, as opposed to the AVERAGE DEVIATION from the mean, which is what the excel function =AVEDEV will return.

E.g. The Average Deviation from the mean for the the above set of values is 8.25%, whereas the Average Deviation from 100% is 17.25%

I can create a complex manual formula which sums absolute values of the variation from 100% of each score, and then divides by the count total of the number of scores, however this is flawed for several reasons, including the fact it is very time consuming when used across a large set of scores, and it is corrupted when a non score exists in a cell (some of the 'scores' may be '-', in which case they should be ignored and not included in the calculation of the average deviation from 100% by treating as a 0% score (and a deviation of 100% from 100%)

I hope this makes sense.

Any help would be much appreciated.


I have a spreadsheet with Data in Columns A-H. Column B is an ID value that will repeat an unknown amount of times. For each Value in Column B I need to calculate the Median, Mean, and GeoMean for the corresponding range of "G_:H_"

Column B Column G Column H 2 10 5 2 13 9 2 9 2 3 8 2 3 7 3 For Column B = 2
I need to Calculate for the range "G2:H4"

I need the Median, Mean, and Geo Mean values to paste in Columns N-P for each different Station Index. My code only calculates for the first Station ID

Here is the code for what I have so far

    Dim r As Long 
    Dim stndx As String 
    Dim i As Long 
    Dim x As Integer 
    Application.Calculation = xlCalculationManual 
    r = Cells(Rows.Count, "A").End(xlUp).Row 
    x = 2 
    stndx = Cells(r, 2).Value[ATTACH]47030[/ATTACH] 
    For i = r To 8 Step -1 
        If Cells(i, 2).Value  stndx Then 
            Cells(x + 1, 11).Value = Cells(i, 1).Value 
            Cells(x + 1, 12).Value = Cells(i, 2).Value 
            Cells(x + 1, 13).Value = Cells(i, 4).Value 
            Cells(x + 1, 14).Value = Application.Median(Range("G2:H" & i)) 
            Cells(x + 1, 15).Value = Application.Average(Range("G2:H" & i)) 
        End If 
    Next i 
    Application.Calculation = xlCalculationAutomatic 
End Sub 

If you like these VB formatting tags please consider sponsoring the author in support of injured Royal Marines
Thanks in advance!!


I am really new to VBA programming and I now need to calculate medians for my experiment data (over 200 participants). I believe Excel macro can do the job, because I have searched and read some old threads in the forum on calculating median but am still not sure how to create one that is appropriate for my data set.

I have attached a sample data file from one participant. In this file, the first column is the response accuracy (1: correct, 0:incorrect) and the second column is the response time (RT) in millisecond. Every 18 trials (rows) is a block, and there are 10 blocks in the sample data. I want to get the median response time for each block (i.e., every 18 trials) for the correct trials only. And the 10 median response times should then be put in the third column.

Please help. I would really appreciate it!


I am trying to create a formula that will calculate the median of the numbers in column B based on a specific criteria in column A and also ony if they are in the one month in column c. but cant work out how to get excell to do this without calculating the balnk cells
eg: i want the median prices for the sales in mosman in april (multiple months in the sheet)
spreadsheet attached
thanks in advance for anyones help

No luck finding an answer? You could always try Google.