Free Microsoft Excel 2013 Quick Reference

GINI Coefficient Formula

Hello all, does anyone have any ideas on how the formula for the GINI coefficient could be expressed in Excel ΣiΣj(yi-yj)

In the formula, y refers to income and both i and j refer to households. Thus for instance if there were two households then to get ΣiΣj(yi-yj) first set i equal to one then do the single sum across the js. Then set i = 2 and do the single sum across the js. Sum the two sums, and you have it. This is equivalent to:

ΣiΣj(yi-yj) = |y1-y1| + |y1-y2| + |y2-y1| + |y2-y2|

I represents the total number of households (which is why you sum from i = 1 to I and also from j = 1 to I).

In my case I have 100 households.

Any help would be much appreciated.


All

I want to calculate the Gini coefficient of a series with the following
macro. Excel reports a type mismatch for sum2 which I don't understand.

Dirk

Sub gini()

Dim ran As Range
Dim sum1, sum2, sum3, sum4 As Variant
Dim i, num As Integer

'ask for the range of the variable
Set ran = Application.InputBox("type in the range of the variable",
Type:=8)
num = ran.Rows.Count

'copy and paste into column to the right, sort
Selection.Copy
ran.Offset(0, 1).Select
Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone,
SkipBlanks _
:=False, Transpose:=False
Application.CutCopyMode = False
Selection.Sort Key1:=ran.Offset(0, 1), Order1:=xlAscending,
Header:=xlGuess, _
OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
DataOption1:=xlSortNormal

'cumulative income
'=SUM($B$2:B2)/SUM($B$2:$B$49)
sum1 = 0
For i = 0 To num - 1
sum1 = sum1 + ran.Cells(i, 1)
Next i
sum2 = 0
For i = 0 To num - 1
sum2 = sum2 + ran.Offset(i, 1)
ran.Offset(i, 2) = sum2 / sum1
Next i

'cumulative population
'=ROWS($C$2:C2)/ROWS($C$2:$C$49)
sum3 = 0
For i = 0 To num - 1
sum3 = sum3 + ran.Offset(i, 1)
ran.Offset(i, 3) = sum2 / num
Next i

'the product in the gini equation
'=(D3-D2)*(C3+C2)
sum4 = 0
For i = 1 To num
ran.Offset(i, 4) = (ran.Offset(i, 3) - ran.Offset(i - 1, 3)) *
(ran.Offset(i, 2) _
+ ran.Offset(i - 1, 2))
sum4 = sum4 + ran.Offset(i, 4)
Next i

'calculate coefficient
With ran.Offset(-1, 4)
..Value = "Gini"
..Font.Bold = True
End With
With ran.Offset(0, 4)
..Value = Abs(1 - sum4)
..Font.Color = 1
End With

End Sub

Hi,

I'm trying to write a custom function to calculate gini coefficients.
I've been able to use this function when inputted manually:

(Where X is a range)
=SUM(ABS(X-TRANSPOSE(X)))/(2*AVERAGE(X)*((COUNT(X))*(COUNT(X))))

....entered as an array function.

What I am interested in creating is a custom function in Visual Basic.
So far, I've gotten this far:

Function GiniCalculator(Range)

GiniCalculator = WorksheetFunction.Sum(Math.Abs(Range -
WorksheetFunction.Transpose(Range))) / (2 *
WorksheetFunction.Average(Range) * ((WorksheetFunction.Count(Range)) *
(WorksheetFunction.Count(Range))))

End Function

But when I use this function, I just arrive at the #VALUE! error
message. I'm not sure why, but it might has something to do with the
fact the above function is an array function. Does anybody have any
thoughts on how to write a custom function for gini coefficients so it
is not necessary to manually input the array address each time?

I am currently trying to use Excel to derive a G-segregation index, which is a form of the standard Gini coefficient. (see attached)

I think this is the correct Excel formula for the numerator (as an array function), and it gives somehting which looks like the right result, but just wanted to check.

{=SUM(Y*TRANSPOSE(Y)*ABS(X-TRANSPOSE(X))}

Does anybody know if this is correct? What would the above look like as a standard summation if not?

Is there another way of calculating this?

[I would attach the original academic paper, bt don't want to infringe copyright - J.Silber - On the Measurement of Employment Segregation (1989)]

I am currently trying to use Excel to derive a G-segregation index, which is a form of the standard Gini coefficient.

The non-matrix form and matrix form equation for what I am trying to calculate is in the attached

I think this is the correct Excel formula for the numerator (as an array function), and it gives somehting which looks like the right result, but just wanted to check.

{=SUM(Y*TRANSPOSE(Y)*ABS(X-TRANSPOSE(X))}

Does anybody know if this is correct? Is there another way of calculating this?

[I would attach the original academic paper, bt don't want to infringe copyright - J.Silber - On the Measurement of Employment Segregation (1989)]

Hi all,

I have written an UDF to calculate the Gini Co-efficient.

The Gini coefficient measures inequalities, in other words, the deviation from a uniform distribution. The data is arrangedfrom the smallest to the highest and accumulated to a so-called Lorenz curve. The Gini coefficient is then calculated by dividing the area between the perfect diagonal and the Lorenz curve by the area between the diagonal and the x-axis. The coefficient can lie between zero and one. The higher the coefficient, the more unequal the distribution of the data.

The steps for calclating the co-efficient are as follows:

Tabelle1

ABCDEFGHIJKL2Step A:
STEP B:

Step C:
Step D:
Step E:
Step F:3Unsorted Data
Count Range Rows, Produce i
Sort Data Ascending
Calculate Fractiles
Cumulate Fractiles
Calculate Gini Coefficient41014
0,083333333=1/12
714
0,06367041
0,06367041
0,10365025714
0,166666667=2/12
731
0,06518637
0,12885679

6779
0,25…
754
0,06723738
0,19609417

7863
0,333333333…
779
0,06946674
0,26556091

8877
0,416666667…
863
0,07695737
0,34251828

9731
0,5…
877
0,07820581
0,42072409

10942
0,583333333…
942
0,08400214
0,50472624

111176
0,666666667…
1014
0,09042269
0,59514892

12754
0,75…
1046
0,09327626
0,68842518

131069
0,833333333…
1069
0,09532727
0,78375245

141046
0,916666667=11/12
1176
0,10486891
0,88862137

151249
1=12/12
1249
0,11137863
1

16

6,5= i
11214

5,8780988

Formeln der TabelleZelleFormelC4=1/COUNT(A4:A15)H4=F4/$F$16J4=H4L4=2/COUNT(A4:A15) *(C16-J16)C5=1/COUNT($A$4:$A$15) + C4H5=F5/$F$16J5=H5+J4C16=SUM(C4:C15)F16=SUM(F4:F15)J16=SUM(J4:J15)

Excel Tabellen im Web darstellen >> Excel Jeanie HTML 4

The UDF I have written is:

Code:
Function Gini(rng As Range)

Dim Cll As Range
Dim RW_cnt
Dim i(1 To 1048575), i_write, i_sum
Dim Cll_val(0 To 1048575), Cll_val_write, Cll_val_temp, a
Dim Frac_val(1 To 1048575), Frac_val_write
Dim Cum_Frac_val(0 To 1048575), Cum_Frac_val_sum

'Step B:

RW_cnt = rng.Rows.Count

For i_write = 1 To RW_cnt
    
    i(i_write) = i_write / RW_cnt 'build fractions
    i_sum = i_sum + i(i_write) 'cumulate fractions
    
Next i_write

'Step C:
For Each Cll In rng
    Cll_val_write = Cll_val_write + 1
    Cll_val(Cll_val_write) = Cll.Value 'read cell values in Range
    
    ' Sort Cell values in ascending order
    If Cll_val_write > 1 Then
        a = Cll_val_write - 1
        Do While Cll_val(a + 1) < Cll_val(a)
            If a >= 1 Then
                Cll_val_temp = Cll_val(a)
                Cll_val(a) = Cll_val(a + 1)
                Cll_val(a + 1) = Cll_val_temp
                a = a - 1
            End If
        Loop
    End If
    
Next Cll

For Frac_val_write = 1 To RW_cnt
    
    'Step D:
    Frac_val(Frac_val_write) = (Cll_val(Frac_val_write) / WorksheetFunction.Sum(rng)) 'calculate fractiles
    
    Cum_Frac_val(Frac_val_write) = Cum_Frac_val(Frac_val_write - 1) + Frac_val(Frac_val_write)  'cumulate fractions
    
    'Step E:
    Cum_Frac_val_sum = Cum_Frac_val_sum + Cum_Frac_val(Frac_val_write)
    
Next Frac_val_write

' Step F:
Gini = 2 / RW_cnt * (i_sum - Cum_Frac_val_sum)

End Function
It works, but I am very sure that this is unelegant, too complicated and not robust enough.
I would love to learn about working with arrays and think this is the perfect opportunity to do so.
Furthermore I think that it would need some error trapping, for instance for:
- non-continous ranges (blank cells within the data)
- Text in the ranges.
-...

What might be beneficial as well, if the function contained an option to choose whether values of 0 are part of the distribution or not like this:

Code:
I'm looking forward to your comments and ideas.

Stephan

Dear all

I am using a UDF for calculation Gini coefficient.
Udf is working absolutely fine, but i would like the range being used by UDF be defined through a user input box.
Any ideas will be greatly appreciated

Attached: Gini.xlsm

This question is related to my post: UDF for Gini Coefficient.

To make the question simple:

At the moment I am reading values into a variable array and sort the values ascending immediately after writing a new value to the array with the following code:

Code:
 1 Then
                a = Cll_val_write - 1
                Do While Frac_val(a + 1) < Frac_val(a)
                    If a >= 1 Then
                        Frac_val_temp = Frac_val(a)
                        Frac_val(a) = Frac_val(a + 1)
                        Frac_val(a + 1) = Frac_val_temp
                        a = a - 1
                    End If
                Loop
            End If
I was thinking of first reading ALL values into the array and sorting it after.
I just don't know if that would be more efficient and how to go about it.
The important points about the array are:

Frac_val(0) is always empty ( Frac_val(0) = "" )
The size of the array is set to equal the number of cells in the range. However, Empty cells are ignored and not written into the array.
Each part of the array therefore contains a number, but the defined size of the array may be larger than the number of array elements that contain data.

Example - 20 Cells in Range -> Array size is Frac_val(0 to 20);

5 Cells in the range do not contain data and are ignored, so the array Structure would be:
Frac_val(0) = "",
Frac_val(1) to Frac_val(15) contain Data,
Frac_val(16) to Frac_val(20) = "".

I would only need to sort Frac_val(1) to Frac_val(15), ignoring Frac_val(16) to Frac_val(20) and possibly resizing the array to Frac_val(0 to 15).

Any help and suggestions would be appriciated.

I must apologize in advance for the length. The question is specific and I have tried to be concise while including all necessary information.

I am attempting to do exponential regression and am finding that using Linest gives values close to what I need, but when I double-check them by charting the same data and applying a trendline the numbers are slightly off, enough that I cannot accept the difference. I believe this may be because my x-values are nonlinear.

Question: Why are my calculations always apparently close, but not correct?

For illustration, assume that my x-values, represented by $A$4:$A$11 are {0, 24, 48, 96, 144, 192, 288, 384} and my y-values, represented by $B$4:$B$11, are {0.5, 0.51, 0.51, 0.65, 0.65, 0.51, 0.65, 0.65}.

A stripped version of the program showing exactly this is attached as "Testbook1".

The 'coefficients' formula that is inputted within the cell will look like this:
Ex. "=EXP(INDEX(LINEST(LN($B$4:$B$11), LN($A$4:$A$11),TRUE,FALSE), 1, 2 ) )"

The 'power' formula may look like this:
Ex. "=LINEST(LN($B$4:$B$11), LN($A$4:$A$11),, TRUE)"

The "R^2" formula may look like this:
Ex. "=INDEX(LINEST(LN($B$4:$B$11), LN($A$4:$A$11),, TRUE), 3, 1)"
The example for 'coefficient' above gives a value of 0.51172862. The trendline option in excel gives a coefficient of 0.5227.

The example for 'power' above gives a value of 0.029640993. The trendline option in excel gives a power of 0.0006.

The example for 'R^2' above gives a value of 0.355785557. The trendline option in excel gives an R^2 of 0.4251.

My next two posts will have progressively more detail for anyone who is interested. If this is enough, I recommend not wasting the time to read them! Auto Merged Post Until 24 Hrs Passes;

Note: I have read this thread: http://www.ozgrid.com/forum/showthread.php?t=85286, which seemed to have a very similar problem. However, reading the posted link (http://support.microsoft.com/kb/211967) that helped the author has not solved the problem.

Here is the long version of the above post:

I must apologize in advance for the length. The question is specific and I have tried to be concise while including all necessary information.

I am attempting to do exponential regression and am finding that using Linest gives values close to what I need, but when I double-check them by charting the same data and applying a trendline the numbers are slightly off, enough that I cannot accept the difference. I believe this may be because my x-values are nonlinear (e.g. 0, 24, 48, 96, 144, 192, 288, 384 as opposed to 1, 2, 3, 4, 5...)..

Short program description
I am analyzing color density in digital photos and calculating the time until the photos 'fail'. Color samples are divided by ink, color, shade, and point in time the reading is taken. I need to find the shades whose first 'points' are closest to 3 or so designated target densities (those we wish we had started with in a perfect world). Once I've found those, I interpolate to theorize what those plots would look like

(This is a general - and generally poor - description but these parts are all working and are not necessary to understand the problem.)

In essence, supplied with raw data in text files (Sheet = 'Original') the program copies and reorganizes them into a 4-D array.

The array is sorted and a new 'interpolation' array is created using the old data.

Finally, the interpolation array is printed (on 'Sheet2') and some formulas are inserted below each plot to determine a, b, and R^2 if the plot were fitted (using the least-squares method) to a curve of type:

y = a * exp(b * x)I am using the method used by ozgrid-member SHG in this thread: http://www.ozgrid.com/forum/showthread.php?t=57470
The only difference, as far as I can tell, is that my x-values (representing 'time') are not linear (e.g. 0, 24, 48, 96, 144, 192, 288, 384 as opposed to 1, 2, 3, 4, 5...).

Here is my code snippet for placing the formulas into cells where 'coefficient' refers to 'a' above, 'power' refers to 'b', and R^2 shows the average distance squared (least-squares method). The lines are very long, read below for disambiguation:


	VB:
	
For j = 1 To numColors 
     
    Cells(currentRow, currentColumn).Formula = "=Exp(Index(LinEst(Ln(" & Range(Cells(currentRow - (numPoints + 2),
currentColumn), Cells(currentRow - 3, currentColumn)).Address & "), Ln(" & Range(Cells(currentRow - (numPoints + 2), 1),
Cells(currentRow - 3, 1)).Address & "),,True), 1, 2 ) )" 
    currentColumn = currentColumn + 1 
     
Next 
 
currentColumn = 2 
 
 'Power
For j = 1 To numColors 
     
    Cells(currentRow + 1, currentColumn).Formula = "=LinEst(Ln(" & Range(Cells(currentRow - (numPoints + 2), currentColumn),
Cells(currentRow - 3, currentColumn)).Address & "), Ln(" & Range(Cells(currentRow - (numPoints + 2), 1), Cells(currentRow -
3, 1)).Address & "),, True)" 
    currentColumn = currentColumn + 1 
     
Next 
 
currentColumn = 2 
 
 'R^2
For j = 1 To numColors 
     
    Cells(currentRow + 2, currentColumn).Formula = "=Index(LinEst(Ln(" & Range(Cells(currentRow - (numPoints + 2),
currentColumn), Cells(currentRow - 3, currentColumn)).Address & "), Ln(" & Range(Cells(currentRow - (numPoints + 2), 1),
Cells(currentRow - 3, 1)).Address & "),, TRUE), 3, 1)" 
    currentColumn = currentColumn + 1 
     
Next 
 
currentColumn = 2 

If you like these VB formatting tags please consider sponsoring the author in support of injured Royal Marines
Disambiguation:

The 'coefficients' formula that is inputted within the cell will look like this:
Ex. "=EXP(INDEX(LINEST(LN($B$4:$B$11), LN($A$4:$A$11),TRUE,FALSE), 1, 2 ) )"

The 'power' formula may look like this:
Ex. "=LINEST(LN($B$4:$B$11), LN($A$4:$A$11),, TRUE)"

The "R^2" formula may look like this:
Ex. "=INDEX(LINEST(LN($B$4:$B$11), LN($A$4:$A$11),, TRUE), 3, 1)"
Question: Why are my calculations always apparently close, but not correct?

For illustration, assume that my x-values, represented by $A$4:$A$11 are {0, 24, 48, 96, 144, 192, 288, 384} and my y-values, represented by $B$4:$B$11, are {0.5, 0.51, 0.51, 0.65, 0.65, 0.51, 0.65, 0.65}.

The example for 'coefficient' above gives a value of 0.51172862. The trendline option in excel gives a coefficient of 0.5227.

The example for 'power' above gives a value of 0.029640993. The trendline option in excel gives a power of 0.0006.

The example for 'R^2' above gives a value of 0.355785557. The trendline option in excel gives an R^2 of 0.4251.

My last post will have the full program with directions. Auto Merged Post Until 24 Hrs Passes;

The program is attached as 'Testbook'.

Description:

The sheet 'Original' has an example of the data I am using, exactly the way I get it in the text files.

'Sheet1' has the data from 'Original' cleaned up a bit.

'Sheet2' is where the output is placed. Don't delete it!

The macro is named 'Redline'. (If you can find a modified use for it, you have my full permission to make millions of dollars off it, as long as you don't turn around and sue me for infringing on "your" copyright.) :-)

Redline will open a series of message boxes.

In order, click "OK", "Yes", and "No". In general, don't mess around with the default settings unless you want to spend some serious time figuring out the program.This should spit out some poorly organized results. The code itself should be somewhat straightforward but it is long. I've done some commenting, but I wouldn't consider it properly documented yet.

Hi,
I am new on this forum, i have seen my question being asked by others but i cant see the answers, I would like to construct a rolenz curve and be able to culculate Gini coefficients too, I dont know what to do. my data is in school grades and the participation of different income groups.

Regards

What is the formula for "coefficient of variation", CV?
--
wdy

Hello,

I am looking for a formula that given an array of numbers - y's and x's will return the "a" and "b" coefficients matching the approporiate power regression function y = a*x^b, the same function that the trendline would have shown if I plotted the chart and added a trendline + equation.

I don't want to use the regular trendline and then get the chart since I have many sets of numbers and I want to produce the regression for each set automatically - using a function that will return the power function coefficients.

I tried some add-ins but so far managed to get coefficients only for exponential functions, using function "LOGEST", I need the similar function that returns the coefficients for the matching power function.

PLEASE HELP!!!

Does anyone know of a way to use #N/A in a formula and get a result other than #N/A?

=IF(H1=#N/A,"1",HLOOKUP(H1,B15:K16,2,FALSE))

Another way to fix my problem would be to have SEARCH look for 0 (zero) and actually find it instead of returning #VALUE!. Is it possible to SEARCH for 0?

=SEARCH(0,F1,5)

This formula returns #VALUE! even when F1 has a fifth character of zero.

Any suggestions?

Just in case, here's a copy of the spreasheet. As long as the "Head Coefficient" ends in anything other than zero the "Head Multiplier" comes out fine. When the "Head Multiplier" end in zero I need Excel to look in column B. Sorry if this is confusing, it's confusing me too.

Thanks again,
Aaron

Hi everyone me again.

Does anyone have any examples how to a multiple regression with formulas?

I'm very new to this stuff and I'm learning. Right now I'm using a Data Analysis tool for regression. This what I did to get the multiple regression.

1. Click on "Tools" then "Data analysis" then choose regression

2. Click on Y range icon - input (highlight) the Y range (dependent variable) - hit "Enter"

3. Click on X range icon - input (highlight) the X range (4 independent variables) - hit "Enter"

4. Check the Labels box (make sure you have included the top labels row in the input ranges)

5. Under "output options" check "New Workshhet Ply"

6. check the "Residuals" box

Hopefully I did that right. But my boss would like me to make template with formulas. Take a look at the example below in cells D4:D8 I did some research and I came up with those formulas but its not comming out the same as the Summary Ouput results. Or are they? I just dont know.

Does anyone knows if they are? Also does anyone knows how to do this with formulas?

******** ******************** ************************************************************************>Microsoft Excel - Alfredo2.xls___Running: 11.0 : OS = Windows XP (F)ile (E)dit (V)iew (I)nsert (O)ptions (T)ools (D)ata (W)indow (H)elp (A)boutD4D5D6D7D8=
ABCDEFGHI1SUMMARY OUTPUT 2 3Regression Statistics 4Multiple R0.970477245 0.917106137 5R Square0.941826083 0.841083666 6Adjusted R Square0.929578943 0.826636727 7Standard Error0.088765731 0.140515269 8Observations24 24 9 10ANOVA 11 dfSSMSFSignificance F 12Regression42.4237440.60593589776.901711.83E-11 13Residual190.1497080.007879355 14Total232.573451 15 16 CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%17Intercept0.537305770.0327316.416416891.11E-120.4688020.605810.4688020.605811800.0563581310.0186683.0189228860.0070590.0172850.0954310.0172850.0954311900.0364543470.0115763.1490475960.0052840.0122250.0606840.0122250.0606842000.0228723320.003436.6682761972.24E-060.0156930.0300510.0156930.0300512100.0054324850.0030141.8024950180.087354-0.000880.011741-0.000880.011741Sheet1
[HtmlMaker 2.42] To see the formula in the cells just click on the cells hyperlink or click the Name box
PLEASE DO NOT QUOTE THIS TABLE IMAGE ON SAME PAGE! OTHEWISE, ERROR OF JavaScript OCCUR.

In Excel 2000 (and earlier), this example.
A B C D
1 1 1^2 1^3 2.3
2 2 2^2 2^3 4.2
3 3 3^2 3^3 4.7
4 4 4^2 4^3 5.1
5 . . . and so on

=INDEX(LINEST(D15,A1:C5),1) gives the first coefficient for a 3rd order
polynomial fit to the data in column D.
=INDEX(LINEST(D15,A1:C5),2) gives the 2nd,
=INDEX(LINEST(D15,A1:C5),3) the 3rd
=INDEX(LINEST(D15,A1:C5),4) gives the 4th. In the form ax^3+bx^2+cx+d, I
can find the fit for the data. The above formulas give me a,b,c and d.
(This can also be used as a single array formula, dropping the "INDEX" part)
Now in Excel 2003, the above formulas will only return a 2nd order fit. the
results are ax^2+bx+c. The first coefficient is 0.
Any thoughts on how generate the third (or forth, or fifth, or sixth) order
fit. This is driving me crazy because it is one of the things I use
constantly. I will be forced to go back to 2000 if I can't resolve this. (And
the idiotic "fx" in place of "=" in the formula bar)

Cheers

Incoherent

I have no VBA experience and have been trying to modify a code I found
online used to extract all coefficients from the trendline textbox.
I've pretty much gotten nowhere. The problem is that it's written to
extract only the coefficients of trendline 1 in chart 1. What I would
like is to extract only the m coefficient of mx+b of all of the
trendlines in however many charts I may have in the sheet. The charts
each have 4 trendlines in them. I would like for chart 1 tline 1
extracted to D3 and tline 2 to E3. Then tline 3 to D4 and tline 4 to
E4. Chart 2 would have tline 1 to D5 and tline 2 to E5. See the
pattern? Every successive chart would have tlines 1 and 2 directly
below the previous charts 3 and 4. Also, these must be extracted to
the above cells in sheet 2. I can have as many as 20 charts in a sheet
all of which have 4 trendlines each which need to have the slopes
extracted. copy and paste gets kind of tedious after a while. Any
help would be extremely appreciated. Here's what I tried to work from:

Sub GetFormula()
Dim sStr As String, sStr1 As String
Dim sFormula As String, j As Long
Dim i As Long
Dim ser As Series, sChar As String
Dim tLine As trendline
Dim cht As Chart
Dim rng As Range
Dim varr()
ReDim varr(1 To 10)
Set cht = ActiveSheet.ChartObjects(1).Chart
For Each ser In cht.SeriesCollection
If ser.trendlines.Count = 1 Then
Set tLine = ser.trendlines(1)
If tLine.DisplayEquation Then
sFormula = tLine.DataLabel.Text '

I frequently like to calculate a weighted correlation coefficient. To do so I use this array formula, where the values I want to correlate are in columns A and B, and the weights are in column C (and assuming there are just 4 rows for this example):

{=(SUM(C2:C5*(A2:A5-SUMPRODUCT(C2:C5*A2:A5)/SUM(C2:C5))*(B2:B5-SUMPRODUCT(C2:C5*B2:B5)/SUM(C2:C5))))/(SQRT(SUM(C2:C5*(A2:A5-SUMPRODUCT(C2:C5*A2:A5)/SUM(C2:C5))^2)*SUM(C2:C5*(B2:B5-SUMPRODUCT(C2:C5*B2:B5)/SUM(C2:C5))^2)))}

I do not know how to create VBA functions, but I would like to do so because it would make it easier to reuse this function on different data. Also it would make it easier to put in conditions to limit the calculation to only certain rows within the range.

Would someone be willing to translate this into a VBA function for me? It would be great if I could just use a function like wtcorrel(A2:A5,B2:B5,C2:C5). An added feature that would be great would be if the function would ignore rows in the range where one or more of the cells was blank. Thanks.

I have a series of stock price data ordered by date.
i want to perform what is known as a spearman correlation coefficient against this unordered data over 20 weeks ( variable number) against what is a sorted descending series for the last 20 weeks ( ie a perfect upward trand of all the observed data points for the last 20 weeks.
each week should look back 20 and recalculate every week showing a running cprrelation to a perfetc trend.

there must be a way of doing this efficiently in one formula and to sort in an automated way - in teh attached spreadsheet i did it manually - need help too automate ideally into th ecorrelation formula somehow?

please help

In Excel 2000 (and earlier), this example.
A B C D
1 1 1^2 1^3 2.3
2 2 2^2 2^3 4.2
3 3 3^2 3^3 4.7
4 4 4^2 4^3 5.1
5 . . . and so on

=INDEX(LINEST(D1:D5,A1:C5),1) gives the first coefficient for a 3rd order
polynomial fit to the data in column D.
=INDEX(LINEST(D1:D5,A1:C5),2) gives the 2nd,
=INDEX(LINEST(D1:D5,A1:C5),3) the 3rd
=INDEX(LINEST(D1:D5,A1:C5),4) gives the 4th. In the form ax^3+bx^2+cx+d, I
can find the fit for the data. The above formulas give me a,b,c and d.
(This can also be used as a single array formula, dropping the "INDEX" part)
Now in Excel 2003, the above formulas will only return a 2nd order fit. the
results are ax^2+bx+c. The first coefficient is 0.
Any thoughts on how generate the third (or forth, or fifth, or sixth) order
fit. This is driving me crazy because it is one of the things I use
constantly. I will be forced to go back to 2000 if I can't resolve this. (And
the idiotic "fx" in place of "=" in the formula bar)

Cheers

Incoherent

Hi everyone please, please help!
I have created a formula to calculate a final price after all variables have been selected from drop down menus.
I am trying to set a coefficient to change the formula by selecting a choice from another drop down manu. For example if the formula is C2+C2*C4 I want this formula to be multiplied or altered depending on a list of colours. So lets say green is 1.07
red is 1.05
blue 1.03 and
black 1.02.
When one of these colours is selected from the drop down menu I want all of the final calculation formula to be multiplied by the set coffiecint for the corresponding colour. How can I do that? and also is it possible to do so in a multiple times?

I am in deperate help.

Thanks

Hi Guys,

I am doing some statistical work with Excel 2007 and have been using the data analysis add-in to do some regression. In terms of a standard linear regression, one can enter the =slope(known y's , known x's) formula to find the regression coefficient for x. My regression is slightly more complicated as it has multiple independent variables (i.e. the y variable is not just a function of x, but also of another variable, say z). I can perform a multiple regression easily but I would like to be able to store the coefficients of the multiple regression as I must eventually add the coefficient estimate for x and z together. The multiple regression output is in a standard format however as I have to repeat this process for 1500 firms (one regression for each) it would be much more convenient if excel just reported the two coefficients for each regression. That way I could copy a formula across all firms and not use the standard regression window 1500 times.

Is there anyway I can perform the multiple regression formulaically, outputting just the two coefficients I require from each regression?

I apologise for the long winded post. Any help is very much appreciated. Thanks in advance.

Many thanks to Andy Pope for posting the links to this code. I'm trying to get the coefficients of several trend lines, on multiple sheets. I don't fully understand all of this code, but I get the basics. My first question is, "Where does the code output the coefficients of the trend lines?"

Option Explicit
Option Base 0
'Function TLcoef(...) returns Trendline coefficients
'Function TLeval(x, ...) evaluates the current trendline at a given x
'
'The arguments of TLcoef, and the last 4 of TLeval: _
    vSheet is the name/number of the sheet containing the chart. _
        Use of the name (as in the Sheet's tab) is recommended _
    vCht is the name/number of the chart. To see this, deselect _
        the chart, then shift-click it; its name will appear in the _
        drop-down list at the left of formula bar.  In the case of a _
        chart in its own chartsheet, specify this as zero or the zero _
        length string "" _
    VSeries is a series name/number, and vTL is the series' trendline _
        number. If the series has a name, it is probably better to _
        specify the name. To determine the name/number, as well as _
        the trendline number needed for vTL, pass the mouse arrow _
        over the trendline. Of course, if there is only one series in _
        the chart, you can set vSeries = 1, but beware if you add _
        more series to the chart.
 
'First draft written 2003 March 1 by D J Braden _
Revisions by Tushar Mehta (www.tushar-mehta.com) 2005 Jun 19: _
    Various documentation changes _
    vCht is now 'optional' _
    Correctly handles cases where a term is missing -- e.g., _
        y = 2x3 + 3x + 10 _
    Correctly handles cases where a coefficient is not shown because _
        it is the default value -- e.g., y = Ln(x)+10 _
    When only the constant term is present, the original function _
        returned it in the correct array element only for the _
        polynomial and linear fits. Now, the function returns it in _
        the correct array element for other types also.  For example, _
        for an exponential fit, y=10 will be returned as (10,0) _
    Arrays are now base zero.
'Limitations: _
    The coefficients are returned to precision *displayed* _
        To get the most accurate values, format the trendline label _
        to scientific notation with 14 decimal places. (Right-click _
        the label to do this) _
    Given how XL calculation engine works -- recalculates the _
        worksheet first, then the chart(s) -- it is eminently _
        possible for the chart to show one trendline and the _
        function to return coefficients corresponding to the values _
        shown by the chart *prior* to the recalculation.  To see the _
        effect of this '1 recalculation cycle lag' plot a series of _
        random numbers. _
    An alternative to the functions in this module is the LINEST _
        worksheet function.  Except for those few cases where LINEST _
        returns incorrect results, it is the more robust function _
        since it doesn't suffer from the '1 recalculation cycle' _
        lag.  With XL2003 LINEST may even return more accurate _
        results than the trendline.
 
Function TLcoef(vSheet, vCht, vSeries, vTL)
    'To get the coefficients of a chart on a chartsheet, specify vCht _
     as zero or the zero length string ""
 
    'Return coefficients of an Excel chart trendline. _
     Limitations: See the documentation at the top of the module _
    'Note: For a polynomial fit, it is possible the trendline doesn't _
     report all the terms.  So this function returns an array of _
     length (1 + the order of the requested fit), *not* the number of _
     values displayed.  The last value in the returned array is the _
     constant term; preceeding values correspond to higher-order x.
    Dim o As Trendline
    Application.Volatile
    If ParamErr(TLcoef, vSheet, vCht, vSeries, vTL) Then Exit Function
    On Error Resume Next
    If vCht = "" Or vCht = 0 Then
        If TypeOf Sheets(vSheet) Is Chart Then
            Set o = Sheets(vSheet).SeriesCollection(vSeries) _
                .Trendlines(vTL)

        Else
            TLcoef = "#Err: vCht can be omitted only if vSheet is a " _
                & "chartsheet"
            Exit Function   '*****
            End If
    Else
        Set o = Sheets(vSheet).ChartObjects(vCht).Chart. _
            SeriesCollection(vSeries).Trendlines(vTL)
        End If
    On Error GoTo 0
    If o Is Nothing Then
        TLcoef = "#Err: No trendline matches the specified parameters"
    Else
        TLcoef = ExtractCoef(o)
        End If
    End Function
 
Function TLeval(vX, vSheet, vCht, vSeries, vTL)
    'DJ Braden
    'Exp/logs are done for cases xlPower and xlExponential to _
     allow for greater range of arguments.
    Dim o As Trendline, vRet
 
    Application.Volatile
    If ParamErr(TLeval, vSheet, vCht, vSeries, vTL) Then Exit Function
    On Error Resume Next
    If vCht = "" Or vCht = 0 Then
        If TypeOf Sheets(vSheet) Is Chart Then
            Set o = Sheets(vSheet).SeriesCollection(vSeries) _
                .Trendlines(vTL)
        Else
            TLeval = "#Err: vCht can be omitted only if vSheet is a " _
                & "chartsheet"
            Exit Function   '*****
            End If
    Else
        Set o = Sheets(vSheet).ChartObjects(vCht).Chart. _
            SeriesCollection(vSeries).Trendlines(vTL)
        End If
    On Error GoTo 0
    If o Is Nothing Then
        TLeval = "#Err: No trendline matches the specified parameters"
        Exit Function
        End If
    vRet = ExtractCoef(o)
    If TypeName(vRet) = "String" Then TLeval = vRet: Exit Function
    Select Case o.Type
    Case xlLinear
        TLeval = vX * vRet(LBound(vRet)) + vRet(UBound(vRet))
    Case xlExponential 'see comment above
        TLeval = Exp(Log(vRet(LBound(vRet))) + vX * vRet(UBound(vRet)))
    Case xlLogarithmic
        TLeval = vRet(LBound(vRet)) * Log(vX) + vRet(UBound(vRet))
    Case xlPower 'see comment above
        TLeval = Exp(Log(vRet(LBound(vRet))) _
            + Log(vX) * vRet(UBound(vRet)))
    Case xlPolynomial
        Dim Idx As Long
        TLeval = vRet(LBound(vRet)) * vX + vRet(LBound(vRet) + 1)
        For Idx = LBound(vRet) + 2 To UBound(vRet)
            TLeval = vX * TLeval + vRet(Idx)
            Next Idx
        End Select
    End Function
 

    Private Function DecodeOneTerm(ByVal TLText As String, _
            ByVal SearchToken As String, _
            ByVal UnspecifiedConstant As Byte)
        'splits {optional number}{SearchToken} _
            {optional numeric constant}
        Dim v(1) As Double, TokenLoc As Long
        TokenLoc = InStr(1, TLText, SearchToken, vbTextCompare)
        If TokenLoc = 0 Then
            v(1) = CDbl(TLText)
        Else
            If TokenLoc = 1 Then v(0) = 1 _
            Else v(0) = Left(TLText, TokenLoc - 1)
            If TokenLoc + Len(SearchToken) > Len(TLText) Then _
                v(1) = UnspecifiedConstant _
            Else v(1) = Mid(TLText, TokenLoc + Len(SearchToken))
            End If
        DecodeOneTerm = v
        End Function

    Private Function getXPower(ByVal TLText As String, _
            ByVal XPos As Long)
        If XPos = Len(TLText) Then
            getXPower = 1
        ElseIf IsNumeric(Mid(TLText, XPos + 1, 1)) Then
            getXPower = Mid(TLText, XPos + 1, 1)
        Else
            getXPower = 1
            End If
        End Function
 
Private Function ExtractCoef(o As Trendline)
    Dim XPos As Long, s As String
    On Error Resume Next
    s = o.DataLabel.Text
    On Error GoTo 0
    If s = "" Then
        ExtractCoef = "#Err: No trendline equation found"
        Exit Function       '*****
        End If
    If o.DisplayRSquared Then s = Left$(s, InStr(s, "R") - 2)
    s = Trim(Mid(s, InStr(1, s, "=", vbTextCompare) + 1))
    Select Case o.Type
    Case xlMovingAvg
    Case xlLogarithmic
        ExtractCoef = DecodeOneTerm(s, "Ln(x)", 0)
    Case xlLinear
        ExtractCoef = DecodeOneTerm(s, "x", 0)
    Case xlExponential
        s = Application.WorksheetFunction.Substitute(s, "x", "")
        ExtractCoef = DecodeOneTerm(s, "e", 1)
    Case xlPower
        ExtractCoef = DecodeOneTerm(s, "x", 1)
    Case xlPolynomial
        Dim lOrd As Long
        ReDim v(o.Order) As Double
        s = Application.WorksheetFunction.Substitute(s, " ", "")
        s = Application.WorksheetFunction.Substitute(s, "+x", "+1x")
        s = Application.WorksheetFunction.Substitute(s, "-x", "-1x")
        Do While s <> ""
            XPos = InStr(1, s, "x")
            If XPos = 0 Then
                v(o.Order) = s 'constant term
                s = ""
            Else
                lOrd = getXPower(s, XPos)
                If XPos = 1 Then v(UBound(v) - lOrd) = 1 _
                Else _
                    v(UBound(v) - lOrd) = Left(s, XPos - 1)
                If XPos = Len(s) Then
                    s = ""
                ElseIf IsNumeric(Mid(s, XPos + 1, 1)) Then
                    s = Trim(Mid(s, XPos + 2))
                Else

                   s = Trim(Mid(s, XPos + 1))
                    End If
                End If
            Loop
        ExtractCoef = v
        End Select
    End Function

Private Function ParamErr(v, ParamArray parms())
    Dim l As Long
    For l = LBound(parms) To UBound(parms)
        If VarType(parms(l)) = vbError Then
            v = parms(l)
            ParamErr = True
            Exit Function
            End If
        Next l
    End Function
Any help would be appreciated. This one is WAY over my head.

Thanks,

-gshock

Hello,

I'm trying to extract the coefficients from the text box created when doing a trendline on an excel chart. I've attempted 2 solutions from this forum, neither of which I can get to work.

(a) I've tried Dave Braden's VBA code that creates function TLcoef, but it returns a "value" error. I believe I've fixed the word wrap issues, but I'm not a VBA programmer, so can't be sure. I've also tried entering it as an array formula, to no avail. I've pasted Dave's code at bottom of this post.

(b) I've also tried Tom Ogilvy's macro solution. this "almost" works. the problem is that my equation has a negative coefficient, and his code does not account for negative's, only positives. I've modified the code slightly, so that the negative coefficient is listed in the spreadsheet, but I can't figure out how to preserve / display this value as a negative. in other words, all the coefficients are assumed positive. the code mod I made is to simply add the line :

sFormula = Application.Substitute(sFormula, " - ", ",")

Any help would be appreciated.

Thank You,
=========================================
Const cFirstNumPos = 5 ' pos. of first integer in displayed eqn
Const cMaxFormat = "0.00000000000000E+00"

Function TLcoef(vSheet, vCht, vSeries, vTL)

'Return coefficients of an Excel chart trendline, *to precision displayed*
'
'Note: While Trendline seemingly always reports subsequent terms from
'a given one on, sometimes it reduces the order of the fit. So this function
'returns, for a poly-fit, an array of length 1 + the order of the requested fit,
' *not* the number of values displayed. The last value in the return array
'is the constant term; preceeding values correspond to higher-order x.

Dim o As Trendline

Application.Volatile
If ParamErr(TLcoef, vSheet, vCht, vSeries, vTL) Then Exit Function
On Error GoTo HanErr
Set o = Sheets(vSheet).ChartObjects(vCht).Chart.SeriesCollection(vSeries).Trendlines(vTL)
TLcoef = ExtractCoef(o, cFirstNumPos)
Exit Function

HanErr:
TLcoef = CVErr(xlErrValue)
End Function

Function TLeval(vX, vSheet, vCht, vSeries, vTL)
'DJ Braden
' Exp/logs are done for cases xlPower and xlExponential to allow
' for greater range of arguments.
Dim o As Trendline, vRet

Application.Volatile
' If Not CheckNum(vX, TLeval) Then Exit Function
If ParamErr(TLeval, vSheet, vCht, vSeries, vTL) Then Exit Function

Set o = Sheets(vSheet).ChartObjects(vCht).Chart.SeriesCollection(vSe*ries).Trendlines(vTL)

vRet = ExtractCoef(o, cFirstNumPos)
Select Case o.Type
Case xlLinear
vRet(1) = vX * vRet(1) + vRet(2)
Case xlExponential 'see comment above
vRet(1) = Exp(Log(vRet(1)) + vX * vRet(2))
Case xlLogarithmic
vRet(1) = vRet(1) * Log(vX) + vRet(2)
Case xlPower 'see comment above
vRet(1) = Exp(Log(vRet(1)) + Log(vX) * vRet(2))
Case xlPolynomial
Dim l As Long
vRet(1) = vRet(1) * vX + vRet(2)
For l = 3 To UBound(vRet)
vRet(1) = vX * vRet(1) + vRet(l)
Next
End Select
TLeval = vRet(1)
Exit Function

HanErr:
TLeval = CVErr(xlErrValue)
End Function

Private Function ExtractCoef(o As Trendline, ByVal lLastPos As Long)
Dim lCurPos As Long, s As String

s = o.DataLabel.Text

If o.DisplayRSquared Then
lCurPos = InStr(s, "R")
s = Left$(s, lCurPos - 1)
End If

If o.Type <> xlPolynomial Then
ReDim v(1 To 2) As Double

If o.Type = xlExponential Then
s = Application.WorksheetFunction.Substitute(s, "x", "")
s = Application.WorksheetFunction.Substitute(s, "e", "x")
ElseIf o.Type = xlLogarithmic Then
s = Application.WorksheetFunction.Substitute(s, "Ln(x)", "x")
End If

lCurPos = InStr(1, s, "x")
If lCurPos = 0 Then
v(2) = Mid(s, lLastPos)
Else
v(1) = Mid(s, lLastPos, lCurPos - lLastPos)
v(2) = Mid(s, lCurPos + 1)
End If

Else 'have a polynomial
Dim lOrd As Long
ReDim v(1 To o.Order + 1) As Double

lCurPos = InStr(s, "x")
If lCurPos = 0 Then
v(o.Order + 1) = Mid(s, lLastPos)
Exit Function 'with single constant term
End If
'else
lOrd = Mid(s, lCurPos + 1, 1)
Do While lOrd > 1
v(UBound(v) - lOrd) = Mid(s, lLastPos, lCurPos - lLastPos)
lLastPos = lCurPos + 2
lCurPos = InStr(lLastPos, s, "x")
lOrd = lOrd - 1
Loop
'peel off coeffs. for affine terms in eqn
v(o.Order) = Mid(s, lLastPos, lCurPos - lLastPos)
v(o.Order + 1) = Mid(s, lCurPos + 1)
End If
ExtractCoef = v
End Function

Private Function ParamErr(v, ParamArray parms())
Dim l As Long
For l = LBound(parms) To UBound(parms)
If VarType(parms(l)) = vbError Then
v = parms(l)
ParamErr = True
Exit Function
End If
Next
End Function
=========================================

Hello,

I am having trouble trying to figure out a formula that will ultimately bring back one value by looking up two values in an array of data.

A B C D

A 1 4 $10.00
A 4 10 $20.00
A 10 20 $30.00
B 1 4 $15.00
B 4 10 $25.00
B 10 20 $35.00
C 1 4 $17.00
C 4 10 $27.00
C 10 20 $37.00

Values to lookup
A 6
B 15
C 2

The data on the left is the table that needs to be looked up from. Basically for the first example I am trying to find the dollar amount for A and the range that 6 falls in.. which would be $20.00. For the second set I need to find B and the range that 15 falls in.. which would be $35.00.

I already have a formula to find the row number in a range of data, but that becomes useless when there are multiple ranges with different coefficients such as A, B, and C.

=SUMPRODUCT(--(B4:B6<=H4)*(C4:C6>=H4),ROW(B4:B6))-3

Newly upgraded to 2007 and a formula that was working perfectly in 2003 no longer is working for me. Kind of crazy but it worked well and it greatly simplified my own programming tasks.

What I was previously able to work with was the following formula:

=INDIRECT(RIGHT(CONCATENATE($B$2,F$4,$B6),31))

Where
$B$2 = '='[Regression Specs.xlsx]
F$4 = AK
$B6 = ''!$A5

and the result returned would be from the formula:

='[Regression Specs.xlsx]AK'!$A5

and would show the result which in this case would be a regression coefficient of 1.83 that I could reference in a computation later.

Anyone have any ideas how to make this work in 2007? Its now just giving me the #REF! result. Removing the indirect function returns: ='[Regression Specs.xlsx]AK'!$A5