Free Microsoft Excel 2013 Quick Reference

Normalize Data

How can I have excel automatically scan a column of data that contains "IDs" and return in another column a normalized data set?

Example: In the column "ID" I have 11 IDs listed with many duplicates. Without using a pivot table, how can I have excel list in a separate column only one instance of a duplicate such as below:

ID Column


Post your answer or comment

comments powered by Disqus
I am trying to normalize data sets in Excel to total 100%. I'm wondering if
there is a formula that can do this without having to correct each cell.

I was discussing something with a coworker regarding the normalization of data on a chart... I can't remember from my college years the formulas for normalizing data onto the same chart.

I have values that I want to compare all linked to the same time (x-axis).

Sales Dollars are in the 40,000-100,000 range
Assembly Minutes are in the 5,000 - 20,000 range
Late Shipments in the 0 - 20 range

I don't need the actual value, just a representative line on a graph.

I hope someone has some insight and can help!

Best Regards,

Hi all, my first post!

I have a small set of data I need to normalize, in order to compare on a graph.

The data includes over 30 days of; daily revenue, visits, advertising spend, etc etc. There are only five columns and 31 Rows. They are different types of data, that need normalized in order to compare over time.

I have the latest ver of Mac Excel 2011. I have read several articles, but none seem to work.

Does anyone have any advice? Thanks!


An account number should consist of 7 digits. Sheet 1-When the data was downloaded from an application any account that contains a zero in front of the number then it would drop the zero and becomes an 6 digits account (note: not all accounts have a zero in front). How can I normalize this so that when I try to reference/lookup to the correct account number it would know to match.

TIA for your efforts.

I have the following data in the left-most column and the results of a
formula in the next:

0.006149 0.710222
0.004668 0.676867
-0.00354 0.49206
-0.01529 0.227635
-0.01644 0.201561
-0.0195 0.132744
-0.01835 0.15867
-0.0254 0
-0.02151 0.087515
-0.01532 0.226867
-0.00885 0.372547
0 0.571774
0.011987 0.841663
0.016434 0.941773
0.01902 1

The formula normalizes the first column data so that the values in the
second column are valued between 1 and 0. The formula is:


My problem is that, in the event that there is a zero value in the column of
numbers (the numbers almost always include positive and negative values), I
would like the normalizing formula to show a zero value in the results
there. Note that the actual 0 datum when normalized through this range,
equals .571774.

Since the zero value would normally not be the middle value of the range of
numbers, I think if I could come up with a formula to normalize all the
values greater than 0 with a range of 0 to 1, then all negative numbers 0
to -1, and 0= 0, I'd have what I am looking for. I'd appreciate any thoughts
here and my thanks in advance. Brad


I have two columns of data from a model simulation showing soil moisture content curves.

Part of the dataset looks like this:

0.12 0.13411
0.118 0.13374
0.119 0.37171
0.118 0.3879
0.119 0.3879
0.29 0.3879
0.373 0.3879
0.374 0.38785
0.376 0.38691
0.379 0.38372
0.381 0.37924
0.382 0.37475
0.383 0.36861
0.385 0.36177
0.386 0.30808
0.272 0.26739
0.238 0.24377
0.214 0.23273
0.198 0.22055
0.187 0.21165
0.181 0.20745
0.173 0.203
0.17 0.19824

Where the left column is the observed and the right column is the fitted data. I looked at the standardize function and I don't think that is what's needed here. Is there a way to take the standard deviations or mean and make them the same for each dataset so that the simulated data can be shifted or "corrected" to appear closer to the observed data? The observed data I guess would be a reference dataset and would not be changed.

Thank you


I have a spreadsheet which is used as a template to upload data. The user
pastes in their data and, if the data validates ok, then a "flatfile" is output
for upload to a database.

As such I have data validation rules in place on various cells/columns across
the sheet. I managed this no problem at all. This validation is run by VBA code
from a command button as the data is ALWAYS pasted in, bypassing the normal data
entry validation. Most of the cells contents are items from various named drop
down lists from another tab on the sheet

Once the validation is complete I have written VBA to highlight the cells
that failed validation. This all works fine.

The problem is that now I have been asked to add validation to make sure that
certain "mandatory" fields are completed (i.e. Not empty). My issue is that to
do this I had to uncheck "Ignore blanks" in the validation rules options as some
cells CAN be blank while others are mandatory. This means I need to check the
previous validated cell to make sure it is not empty (to check the end of the
file has not been reached) and then apply the validation to the cell ONLY if the
previous cell was not empty.

I think I need to use the custom option in the validation rules but do not
know how to create a formula which only validates if the previous validated cell
is not null.

Any help or pointers would be greatly appreciated.


Not really sure how to word the title.

Basically I have a range of results in % ranging from 80% to 214%
Now I want to use this data elsewhere but want to adjust the results in the range that are above 100 to bring it to a score between 100 and 110.

I was going to create a multiple IF statement that reduced anything over X by 10%, anything over Y by 20%, etc. but this doesn't yield consistent results through the range.

I want to leave scores under 100 alone and adjust those above 100.

Possibly this is not even a specific excel question, more maths, but I need to use it in my spreadsheet.


My problem is similar to the one described in these threads, but their solution didn't work for me.

I have data imported from a database, which is the result of a questionnaire.
There are 10 questions and 4 possible answer. Ie. "very satisfied", "satisfied" etc. There's also the name and company of each entry.

I want to create a pivot table that look something like this:

very satis. satis. unsatis. very unsatis. q1 6 4 3 1 q2 4 3 etc. q3 ...

Right now i can get a the first column right by putting question 1 the the "Column labels".box and the name of the entrant in the "Values"-box of the pivot table field list.

I think it's because i don't have a field with just the 4 possible answers to put as the row label.

I need something where i can update from the database without needing more than a click on refresh, as others (not excel gurus!) will need to be able to update the data.

Any help would be greatly appreciated.

How can I change the formatting on a line graph to show data that is not relevant ?

I have a data series showing attendance at a classroom. The school is closed for a weeks holiday, then resumes. If I include a blank cell for this period I get missing data, if I use #N/A then the lines will join. But how can I show the normal data as a solid line and the holiday period as a dotted line for example.


Griff :o)

We're in the process of normalizing maintenance at my company based on past maintenance history. E.g. - if I have an 8 hour activity to clean and inspect a pump once every 1.5 years but the pump has never broken and we haven't seen anything wrong with it in 10 years, we're changing the work frequency to 3.5 years.

I'm not sure how to set up an equation to calculate normalized man hour savings per year; which is kind of tough since some of our activities are not going to be exactly 52 weeks.

I was thinking of setting up 2 columns, one with the old frequency of performance (1.5 years) and the other with the new frequency (3.5 years) and then depending on whatever my equation gave me as a result for man hour savings by going to 3.5 years, divide that number by 3.5 to get "normalized" data for my total yearly savings.

This way we can show the executives how much money we're saving by doing the maintenance optimization.

Thanks a lot for the help

I have a bunch of names in column A. They are a mixture of either normal data and some other ones are data that is hyperlinked.

Is there any way to have a formula in column B that will transfer all the data in column A that is not hyperlinked. If it is hyperlinked, then leave it blank.


data data
data data
data data

I've got a pivot table with three columns (general, general, currency) on a row next to two fields of normal data (currency, general). When I sort the table, three of the columns sort correctly with the first normal field but the final text field does not sort with them. How can I sort all five fields at once?

Hi Everyone,

Is there any way to display a data lable for the last point on series in a plot? The normal data label option labels all points and I only want the last.



I have a Pivot table but i don't have the data source used to create it, i
want to convert it back to a normal data table. how can i do this?

So I work in sales and I created a price quoting worksheet that allows me to pick from a drop down menu from a list of all my products.

The problem is that after I finish making all the selections and applying the proper pricing, I want to be able to lock down every cell so it can't be changed at all.

The standard sheet and workbook protection seem to protect the normal cells that just have normal data within them, but any cell that is a drop down menu from data validation still is able to be changed.

This kind of opens a legal loophole because a client can realistically take my quote and change the product choices to whatever they want them to be.

I want to lock these menu's down so they can't be changed from what I originally selected and saved.

Any thoughts?

Hi there,
I have three columns:
  A     B          C
1.2    3.3       3.4
3.2    4.2       4.0
2.4    3.5       4.2
5.3    5.1       5.9
I want to create a 2-D clustered column graph while normalizing to column C. I mean all values in column C should be set to 1 and the other two columns are normalized to 1.

I read some documents about how to normalize data in excel, but I can not apply their methods to my problem.

Thanks for any comment.

I am in the process of normalizing test results in my school district. I
have roughly 250 test scores which I want to norm. I am relatively new to
excel but many have suggested that I use this program to do the statistical
work for me. Any and all help will be appreciated.

I have a problem when i pasted over a value from another worksheet to the worksheet which has data validation appplied on all the cells.

For example, I want to copy the values from 'Sheet 3' to 'Sheet 1'.
Sheet 1 has the data validation with the condition such that a text length which doesn't have 9 digits will prompt an error message ' Pls enter 9 digits'.

There is a similar macro which i think can be used for this too. But im not too sure where i should edit to change to the condition that i want.

Sub postPasteValidation()
Dim validationCells As Range
Dim oneCell As Range
Dim userInput As Variant, promptStr As String
On Error Resume Next
Set validationCells = ActiveSheet.Cells.SpecialCells(xlCellTypeAllValidation)
On Error GoTo 0
If Not (validationCells Is Nothing) Then
    For Each oneCell In validationCells
        If Not (oneCell.Validation.Value) Then
            promptStr = oneCell.Address & " has a bad value in it." & vbCrLf _
                            & "Please enter the correct data per" & vbCrLf _
                            & oneCell.Validation.InputMessage
            userInput = Application.InputBox(prompt:=promptStr, Default:=oneCell.Value)
            If userInput <> False And oneCell.Value <> vbNullString Then oneCell.Value = userInput
        End If
    Next oneCell
End If
End Sub
When i try using the data validation on sheet 1 and do a normal data entry it works fine. Just have a problem with copying and pasting part. Any help will be much appreciated. Thanks.


I want to validate the data in each cell in a given column, based on the data in another column for the same row, then move to the next row and so on.


A1 = APPLE so B1 must be GREEN or RED.

I want to put the list of acceptable inputs in another worksheet so they can be referenced and amended easily.

Any errors identified would need to be counted and the row copied to a report worksheet

Can someone help?

Ps – I have tried using normal data validation but am unsure of how to have the input checked against another column entry. I would like to try out both methods if it can be done?


I am familiar with the normal "data validation - list" I have several named lists. I want the correct list to appear as drop down in the cell D3 depending of the value of cell C3

Example if you choose: drink in cell D3 you get the drop down list ;water;coffee....
But if you choose meal in D3 you get burger, hot-dog....

Have a long list and want to restrict the values depending on the first choice.

Hi, I have a pivottable as in the screenshot (PivotTable.JPG). How can I strip it of the pivottable formatting/layout and convert it into a pure data table with VBA (as in DesiredTable.JPG)?

I have highlighted in red in the desired (output) table the areas that require additional attention. In particular, PivotTables do not repeat field values if they are shared by several rows (e.g. we need to insert the value "A" into A3 of the desired table).

I know the desired table in this example can be typed out without VBA, but this is a simplified version of a much bigger table which will require VBA to systematically approach this (e.g. RowField1 will contain more values than just "A" and "B").

Many thanks.

Hi Everyone,

I need your assistance working on a large data set that I am working with. I am looking for a restrictive way of identifying the outliers in my data. I am currently using the median and an array formula to identify outliers that are two deviations from the Median Absolute Deviation. This gives me both positive and negative outliers. I used an array formula to distinguish between normal data and an outlier. However outlier shows up for both positive data and negative data. How do I distinguish between the two types of outliers?
Secondly, is there a better approach to finding outliers and how do I go about capturing the data.
Any assistance you have would be greatly appreciated.


Hi there,

I was wondering if one of you genius' out there could help me?

I have a sheet (see attached), where i want to input wagon numbers on sheet 1. (There is a named range on sheet 2 called wagonnumbers, used for data validation)

What i don't want is the users of this sheet to input a duplicate wagon number twice in the column.

I can put on normal data validation list, but that doesn't prevent duplicated entries.

I also figured out that i can prevent duplicate entries of wagons using data validation custom and typing in a Countif if formula, but that doesn't use the list drop downs that i desire.

I am looking at making the sheet as simple to use for the end user and don't really want to go down the VB macro route if possible.

In a nutshell, i'm after a data validation list that prevents duplicate entries, CAN THIS BE DONE???

Cheers Guys

No luck finding an answer? You could always try Google.