Hello All,

I am interested in programming an Excel 2010 spreadsheet to perform an outlier assessment for a given data set using Median Absolute Deviation (MAD). If my data set is 0.913, 0.927, 0.778, 0.912, 1.387, 0.917, 0.920, 0.918, 0.916 and 0.921, how should I program my spreadsheet to calculate MAD? In addition, what is the best means in which to highlight if a value is an outlier? Thanks.

I am interested in programming an Excel 2010 spreadsheet to perform an outlier assessment for a given data set using Median Absolute Deviation (MAD). If my data set is 0.913, 0.927, 0.778, 0.912, 1.387, 0.917, 0.920, 0.918, 0.916 and 0.921, how should I program my spreadsheet to calculate MAD? In addition, what is the best means in which to highlight if a value is an outlier? Thanks.

- Finding Outliers and Obtaining the Data
- MS Excel: Standard Deviation Conditional Calculation
- VBA Function to calculate Mean and Standard Deviation
- Calculate a forecasted increase using a Absolute Ref
- Median of all combinations
- Standard Deviation of Dynamic Range
- Find blank cell, insert formula to find median of above cell range
- VBA Macro help please
- Excel 2007: Functions: Statistical Functions | Quamut: the go to how ...
- Blanking / Discounting zero values in ranges
- Problem with interpolating data
- Avedev
- Detect matching values in very large lists
- Embedded Array Formula
- Technical issue in running simulations in Excel 2007
- finding the middle cell in a column of numbers
- Implementing a Label with certain data in Chart
- How to calculate Mean, Median, Mode and standard deviation in group data?
- Median, Average, and Standard Deviation from large set of data
- Calculating AVERAGE DEVIATION from a SPECIFIED POINT??
- Calculating the Median, Mean, and GeoMean of dynamic range
- Creating a macro to calculate medians for every 18-row range
- Calculate median without zeros based on another column criteria

I need your assistance working on a large data set that I am working with. I am looking for a restrictive way of identifying the outliers in my data. I am currently using the median and an array formula to identify outliers that are two deviations from the Median Absolute Deviation. This gives me both positive and negative outliers. I used an array formula to distinguish between normal data and an outlier. However outlier shows up for both positive data and negative data. How do I distinguish between the two types of outliers?

Secondly, is there a better approach to finding outliers and how do I go about capturing the data.

Any assistance you have would be greatly appreciated.

Thanks.

Can someone please help me with my query? I have attached my file ("STDEV.xls").

I have a large database and would like to calculate standard deviation based on a condition "Yes" or "No".

The attached file is a small selected example of my database.

In cell B10, I am using the following array formula to calculate the standard deviation: "{=STDEV(($D$1:$IV$1="No")*(D2:IV2))}".

Please note that in some instances, the cell in the "Value" row will be blank. Again, I have shown this above.

Please can someone help me to obtain the correct answer of "6.00" and not "6.31" as calculated by the array formula.

What change(s) do I need to make to my array formula?

Many thanks.

The first return I calculated in the following way.

=MMULT(TRANSPOSE(B1:B3);H12:H14)-1

The Standard Deviation is calculated in this way.

=SQRT(MMULT(MMULT(TRANSPOSE(B1:B3);C24:E26);B1:B3))

See attached excel sheet.

It would be very helpful if a VBA script can be written which automatically calculates the 5151 returns and the 5151 standard deviation.

I am new to Excel and i am wanting to know how to Calculatet a Forecasted

Increase using say P1 as an Absolute Reference and calculate the percentage

increase on a monthly income.

This is so the formula could then be replicated across without having to

re-type the information in the Ref Cell

This is my first post.

I have 5 numbers (can be with decimals) in column A (A1:A5). For these

numbers, I want all combinations as well as the median of these

combinations.

I'm pretty far already. However, the median seems to be calculated

wrong.

Who can help me?

Thanks,

Andreas

Sub test() ListCombos Range("A1:A5"), 3, "C:Q1_3er.csv" End Sub Sub ListCombos(r As Range, ByVal m As Long, sFile As String) ' lists the combinations of r choose m to file sFile ' r is a single-column or single-row range Dim ai() As Long Dim i As Long Dim n As Long Dim sOut As String Dim sOutMedian As String Dim iFF As Integer If r Is Nothing Then Exit Sub If r.Rows.Count <> 1 And r.Columns.Count <> 1 Then Exit Sub n = r.Count If m < 1 Then Exit Sub If m > n Then m = n iFF = FreeFile Open sFile For Output As #iFF ReDim ai(1 To m) ai(1) = 0 For i = 2 To m ai(i) = i Next i Do For i = 1 To m - 1 If ai(i) + 1 < ai(i + 1) Then ai(i) = ai(i) + 1 Exit For Else ai(i) = i End If Next i If i = m Then If ai(m) < n Then ai(m) = ai(m) + 1 Else Exit Do End If End If ' catenate and write to file sOut = vbNullString Call Sort(ai) sOutMedian = median(ai) For i = 1 To m sOut = sOut & r(ai(i)).Text & "," Next i Write #iFF, Left(sOut, Len(sOut) - 1), sOutMedian Loop Close #iFF End Sub Sub Sort(Arr() As Long) Dim Temp As Double Dim i As Long Dim j As Long For j = 2 To UBound(Arr) Temp = Arr(j) For i = j - 1 To 1 Step -1 If (Arr(i) <= Temp) Then GoTo 10 Arr(i + 1) = Arr(i) Next i i = 0 10 Arr(i + 1) = Temp Next j End Sub Function median(Arr() As Long) Call Sort(Arr) If UBound(Arr) Mod 2 = 1 Then median = Arr(Int(UBound(Arr) / 2) + 1) Else median = (Arr(UBound(Arr) / 2) + Arr(Int(UBound(Arr) / 2) + 1)) / 2 End If End Function

I have a dynamic range that I need to be able to calculate the standard deviation for. The dynamic range sometimes has blank cells and #N/A cells which I do not want included in the calculation. However, there are sometimes cells with the number 0 in it (instead of blank) which I do want to include. I have come up with an array formula, but the standard deviation it calculates is not correct. The formula is:

=STDEV(IF(ISNUMBER(A3:A20000)*ISNUMBER(A3:A20000),1,0))

*This is an array formula calculated by pressing CTRL+SHIFT+ ENTER

The number it calculates is 0.43, however, this is incorrect. If you just calculate the standard deviation of the actual data points in excel (cells A138:A5055 in this case) : STDEV(A138:A5055), it returns 0.17. This is the correct standard deviation. Could someone please help me with this?

Kind Regards,

Maani

http://www.4shared.com/file/12409801...c74/Book1.html

I am trying to find medians for a series of values (in a column) separated by blank rows where the formula knows to end the median calculation when it runs into the first blank cell--and the median number will be calculated in this blank cell.

For instance, cells O3:O15 are filled in, O16 is blank, I want a code that can find the blank O16 cell, and in that cell calculate the median of O3:O15.

I have about 150,000 rows, and there is not a uniform spacing between the blank cell ranges.

thanks in advance!

Beth

So here´s my mathematical problem:

I´ve to figure out the reproducibility of data from repeated analysis creating the arithmetic mean (Interassay Variability)including the absolute error.

EXCEL peforms three diffrent kinds of analysis for this kind of problem:

- The variance(discord)

- The mean absolute deviation

- The standard deviation

All footing on the basic population (universe).

So I would like to know which is the mathematical correct one and why. Because I tried all three with the same data and got three different errorvalues.

Thanks for reading. Bye.

Mattias

(a biologist lost in the middle of physicians)

Again, if its takking the total means of subgroups a1:a8, a10:a16, a18:a24 and I hide a1:a8, i want my total mean and median lines to update to calculate only visible sheets. I will wait with bated breath with the hope that someone better than I will be able to assist in this request.

Function Description Syntax AVEDEV Returns the average of the absolute deviations of data points from their mean (number1,number2,...) AVERAGE Returns the average of its ... ...

What I have so far is a front sheet of data of numbers of crimes in a given time period (fortnight) arranged in columns according to crime type.

Each column of data (crime series) is then taken into separate and subsequent worksheets as a row using the OFFSET function and the mean, standard deviation etc. calculated. This is then used to update a dynamic statistical process control chart using the latest n (user selectable) periods and where the row is a named series.

I would like the spreadsheet to be as user-friendly (idiot-proof!) as possible and so would like others to only have to input figures on the initial data sheet and have everything else automated. The problem is that where there is no data to drag through from the data sheet to the individual worksheets (i.e. numbers of crimes happening in the future) I end up with a zero as the value, and eventually zeros for the mean/standard deviation. This in turn corrupts the control chart.

Is there a way in which I can copy the various formulae across the various rows, but have the cells show as blank if no data is available, and also have the chart plot the latest n periods of data (also discounting the zeros/blanks)?

Hope this makes sense,

Robert

When 2 datapoints are rejected after each other this no longer goes. When this is the case I want to replace the data point with the polyline value.

I tried to make something, but it runs stuck.

'Now interpolate through CheckedRow = Headerrows + 1 Do While CheckedRow < NumRows 'Set variables checkedKP = Cells(CheckedRow, KPColumn) CheckedValue = Cells(CheckedRow, AfterRejectionEColumn) If CheckedValue <> "Rejected" Then 'Leave as is Counter = 1 ElseIf CheckedValue = "Rejected" Then LastgoodKP = Cells(CheckedRow - 1, KPColumn) LastgoodTop = Cells(CheckedRow - 1, EastColumn) LastGoodRow = CheckedRow - 1 NextgoodKP = Cells(CheckedRow + 1, KPColumn) NextgoodTop = Cells(CheckedRow + 1, AfterRejectionEColumn) Interprecords = (NextgoodKP - LastgoodKP) Topdiff = (NextgoodTop - LastgoodTop) Cells(CheckedRow, AfterRejectionEColumn) = (Counter + 1) * ((Topdiff * (checkedKP - LastgoodKP)) / Interprecords) + LastgoodTop If CheckedValue + 1 = "Rejected" Then Cells(CheckedRow, AfterRejectionEColumn) = Cells(CheckedRow, PolyEColumn) Counter = Counter + 1 End If End If CheckedRow = CheckedRow + Counter ' Move on as many as required to get next non rejected value Loop

Mean Absolute Deviation?

Statistically Challenged in Chicago!

Dianne

I'm trying match values (and set a Yes / No result) from values in two very large lists.

List 1 (approx 170,000 rows) contains the 'Find What' values

List 2 (approx 980,000 rows) contains the 'In What' values

Values in list 1 will be unique in list 2, but not all values in list 1 will appear in list 2.

The values in each list are all 16 character stings

This is a one-off otherwise I'd probably import into a AccessDB

Using formulas is taking an absolute age to calculate using MATCH function, so I'm wondering if coding is the better angle.

Any ideas/insight/methods welcome.

(Using XL07, on Core2Duo)

I am currently writing a macro to facilitate the data analysis process of my experimental data. The next step I would like to take with this macro is using the standard deviation of my data to see if it is plausible to neglect certain outlying data and what results I obtain after doing so. I have developed a logic statement in excel should do just what I desire however the original equation uses relative references which isn't correct so I need to actually enter the formulas from those cells into my formula. The problem that has arisen is that my standard deviation is calculated using an array. Is it possible for me to use an embedded array for just that part of the formula?

Now, my problem is when I try to run the simulation with X number of trials. I defined the rectangle (in terms of number of trials, mean, standard error, median, standard deviation, variance, etc. on the vertical axis), selected the rectangle along with the neighboring (to the right) cells, but when I press "ctrl+w" to run the simulation I keep getting the default Excel exit message "Do you want to save the changes you made to...", since Excel recognizes this action in accordance with its shortcut definition.

The interesting part is that I have another simulation file that works flawlessly. So, when I run the simulations in this (second) file, and then go back to my first file (the one described above) the simulations work fine. But when I completely exit Excel and then open up the first file again, I face the same problem.

I would like to ask if someone would help me get around this unusual issue.

Thank you very much for your assistance

Not sure of the right tiltle here goes

I am looking for a formulas to first find the middle number in a column of numbers eg 1,2,3,4,5 3 is the middle (similar to median) thats where the calculations start...

it then assigns values of minus to the numbers above the middle and plus values to the numbers below the middle

1 -50

2 -50

3 0

4 +50

5 +50

now when it comes to even numbers eg 1,2,3,4,5,6 if i use median it divide 3 & 4 and comes up with 3.5 ........ i want it to recognize 3 and 4 as the middle numbers

and assign plus and minuses above and below the middle numbers

1 -50

2 -50

3 -25

4 +25

5 +50

6 + 50

I do hope you all understand, this has been buggin me for weeks

cheers in advance

I have a data set plotted in Excel.

Is there any way to show automatically the minimum, maximum and median value

of this data set on the chart? (In a form of label or anything similar)

Minimum and maximum values are already part of data set.

Median value is a calculated value and is in another cell(not part of data

set).

Thank You in advance

I am trying to create a worksheet capable of calculating the turnaround processing time for emails in a 14 days time frame.

I am wondering if it is possible to create a dynamic excel worksheet which can calculate Mean, Median, Mode and standard deviation within a range of data as Column B listed below changes every forthnightly.

Sample data below:

Column A Column B

Number of days Email

O – 1 600

2 - 3 700

4 - 5 100

6 - 7 50

7 - 14 30

- First row: 600 emails took a day to complete

- Last row: 30 emails took between 7 - 14 days to complete

1. To find the average processing time

2. To find the median processing time in days

3. What is the standard deviation (in days)

Would greatly appreciate if someone could help.

I apologise if I was unable to articulate the whole scenario clearly.

Many Thanks!

I'm stumped. I have a very large set of data (over 200.000 records in

MSAccess DB) that I'm analyzing using Pivot Tables. Average and Standard

deviation functions are "built-in" so no problem there... now, how can I

calculate the median???

Please help!

Thanks in Advance

I have a set of % score values (e.g. 88%, 94%, 82%, 67%)

I would like to know if there is a relatively simple formula that can be used to calculate the AVERAGE DEVIATION from 100%, as opposed to the AVERAGE DEVIATION from the mean, which is what the excel function =AVEDEV will return.

E.g. The Average Deviation from the mean for the the above set of values is 8.25%, whereas the Average Deviation from 100% is 17.25%

I can create a complex manual formula which sums absolute values of the variation from 100% of each score, and then divides by the count total of the number of scores, however this is flawed for several reasons, including the fact it is very time consuming when used across a large set of scores, and it is corrupted when a non score exists in a cell (some of the 'scores' may be '-', in which case they should be ignored and not included in the calculation of the average deviation from 100% by treating as a 0% score (and a deviation of 100% from 100%)

I hope this makes sense.

Any help would be much appreciated.

James

Ex.

Column B Column G Column H 2 10 5 2 13 9 2 9 2 3 8 2 3 7 3 For Column B = 2

I need to Calculate for the range "G2:H4"

I need the Median, Mean, and Geo Mean values to paste in Columns N-P for each different Station Index. My code only calculates for the first Station ID

Here is the code for what I have so far

VB:Thanks in advance!!Median() Dim r As Long Dim stndx As String Dim i As Long Dim x As Integer Application.Calculation = xlCalculationManual r = Cells(Rows.Count, "A").End(xlUp).Row x = 2 stndx = Cells(r, 2).Value[ATTACH]47030[/ATTACH] For i = r To 8 Step -1 If Cells(i, 2).Value stndx Then Cells(x + 1, 11).Value = Cells(i, 1).Value Cells(x + 1, 12).Value = Cells(i, 2).Value Cells(x + 1, 13).Value = Cells(i, 4).Value Cells(x + 1, 14).Value = Application.Median(Range("G2:H" & i)) Cells(x + 1, 15).Value = Application.Average(Range("G2:H" & i)) End If Next i Application.Calculation = xlCalculationAutomatic End SubIf you like these VB formatting tags please consider sponsoring the author in support of injured Royal Marines

Cameron

I am really new to VBA programming and I now need to calculate medians for my experiment data (over 200 participants). I believe Excel macro can do the job, because I have searched and read some old threads in the forum on calculating median but am still not sure how to create one that is appropriate for my data set.

I have attached a sample data file from one participant. In this file, the first column is the response accuracy (1: correct, 0:incorrect) and the second column is the response time (RT) in millisecond. Every 18 trials (rows) is a block, and there are 10 blocks in the sample data. I want to get the median response time for each block (i.e., every 18 trials) for the correct trials only. And the 10 median response times should then be put in the third column.

Please help. I would really appreciate it!

Thanks,

Sandra

I am trying to create a formula that will calculate the median of the numbers in column B based on a specific criteria in column A and also ony if they are in the one month in column c. but cant work out how to get excell to do this without calculating the balnk cells

eg: i want the median prices for the sales in mosman in april (multiple months in the sheet)

spreadsheet attached

thanks in advance for anyones help

David