October 24, 2017 Filed in: Articles

In many experiments students collect two-variable data, make scatter plots, and then try to find the line of best fit so they can talk about how two variables are related. Microsoft Excel has a built-in function that readily does this. Just enter the data, create a scatter plot, right-click on any piece of data on the graph, and click on the

Did you notice the green dots don’t have a fitted curve running through them? That data is too complex for the Add Trendline feature; but Excel does have another feature you can use to fit curves to data. The rest of this article sets out to explain how to use it for yourself.

The need to analyze and fit complex data like those green dots arises with the dampened harmonic oscillator, which is encountered in grade 12 university level physics. For example, the amplitude of swing of a pendulum experiencing friction can be described by the following equation.

http://hyperphysics.phy-astr.gsu.edu/hbase/oscda.html

In second-year of university in classical mechanics, you may have encountered a projectile, with air resistance, which has a trajectory described by the following equation.

http://mistersyracuse.com/uploads/3/0/8/0/3080275/projectile_motion_with_air_resistance.pdf

Also, if you have ever studied spectroscopy, you may have done labs involving complex curves that ended up looking something like this.

Below is a graph with three blue data points. A straight line has been drawn through the data; however, that line might not provide the best fit for the data. To find the line of best fit, we draw straight lines, shown in orange, vertically up or down from each point until they intersect the line of fit (often referred to as

Now think of the areas of those squares as a measure of how far away the line is from where it ought to be. If you add up the areas of all those squares, then you would get a number which represents how far the fitted line is from all the data points. The smaller that sum, the better, which is why the method is referred to as the

Excel does a least-squares fit automatically when you click the Add Trendline option, but for complex data, you’ll need to know how to do this in Excel manually. For simplicity, we’ll start by trying to do a least-squares fit on data that happens to be linear, and then we’ll try a more complex fit afterwards. If you’re reading this on your phone, go get your computer, and follow along! If you’re on your computer, open a new, blank Excel document. Copy the following data table and paste it into the cells A1 through B12.

If, when you pasted the data into Excel, you encountered some formatting issues, try pressing the Ctrl+Alt+V keys all at once instead of just Ctrl+V to paste without keeping the formatting. Select the data. Then click on the Insert tab at the top of Excel and insert a scatter plot of the data into your spreadsheet. After labelling the axes and inserting a legend, it should look something like this.

This graph looks like a linear fit could describe it reasonably well. By visual inspection, the line might have a y-intercept of about 2, and a slope close to 0.5. We’ll call that our ‘guess function’, that is,

Next, we should graph the guessed y-values onto the same graph as before. To do this, right-click on any data point in the original scatter plot, and then click on Select Data. Press the Add button, and for the Series Name, select cell C1; for the Series X Values, select cells A2 to A12 by selecting them with your mouse; and for the Series Y Values, select cells C2 to C12; then click on Ok. The Excel document should now look like this.

You can see from the graph that the guessed function could be improved by decreasing the slope a little. This is easy to fix, just change the value in the cell F1 from ‘0.5’ to ‘0.4’. Everything will change automatically, and the graph that we guessed earlier should fit the data better now, as shown below.

To do a least-squares fit we’ll need to calculate the square of the distance between each of the original y-values and the appropriate guessed y-values. Label cell D1 as ‘(y-yg)^2’, and in cell D2, write ‘=(B2-C2)^2’, followed by the Enter key. Again, click and drag the small square downwards to quickly do this for every x-value. Then we’ll sum the squares to get a measure of how good or bad our guess was at fitting the data points. To do that, label cell D13 as ‘sum’, and in cell D14, write ‘=SUM(D2:D12)’, followed by the Enter key. Now our Excel document looks like this.

- Click on the File button in the top left, then press Options at the bottom left
- In the pop-up window, click on the Add-ins tab, and where the at the bottom it says Manage: Excel Add-ins, press the GoÖ button (do not press the Ok button)
- In the next pop-up window, check the Solver Add-in checkbox, and hit Ok

If you’re having trouble activating Solver, try watching this demonstration:

If you’re using an older version of Excel, or are not using Windows, then you may need to enable Solver in a slightly different way. Try a Google search on how to enable Solver in Excel. The Solver add-in is on both Windows and Macs with Excel, and has been for many years, so not matter what version of Excel you have, Solver should be present.

Once the Solver Add-in is enabled, you can access it after selecting the Data tab (to the right of the File tab from before). Under the Data tab, click on Solver, found within the Analyze heading. A new pop-up window will appear. What you want Excel to do here is to minimize the sum of the squares, by altering the values of m and b from earlier. Set ‘Objective’ to ‘$D$14’, set ‘To’ to ‘Min’, set ‘By Changing Variable Cells’ to ‘$F$1:$F$2’, then press the Solve button. You may have noticed a checkbox which allows for negative variables, for now don’t worry about that because our y-intercept and slope are both positive. After clicking on the Solve button, you should notice the values of

Now that you have finished your first least-squares fit in Excel, let’s try and tackle that damped harmonic oscillator that was mentioned before. We’ll use an equation of this form to describe it.

We have initially guessed

Now adjust your values for the coefficients

Once you have a good guess (it is important to have a good initial guess), you can use the Solver command. Click Solver, then Set ‘Objective’ to ‘$D$29’, set ‘To’ to Min, set ‘By Changing Variable Cells’ to ‘$F$1:$F$5’, then press the Solve button. Your sum of the squares should decrease, and the variables should change slightly to the values shown in the next image.

And that’s it! Congratulations on performing a least-squares fit on a dampened harmonic oscillator.

In addition to physics courses, this kind of curve fitting can be performed in grade 12 advanced functions (MHF4U) in the combining functions section (unit D specific expectation 2). In that course I turn this into a project where my students use a camera to take a video of a swinging pendulum, use video analysis software like Logger Pro (available for a fee at https://www.vernier.com/products/software/lp/) or Tracker (available free at http://physlets.org/tracker/) to analyze the pendulum’s motion, and then use Excel to find an equation which describes the motion of the pendulum. The students should find that the data can be fit to a combination (specifically, the product) of an exponential function and a sinusoidal function. In case you’re not familiar with video analysis software, here’s an image of what Tracker looks like during the pendulum lab.

You can even repeat the pendulum experiment multiple times to see whether the frequency or damping coefficient changes depending on the initial displacement of the pendulum, or the length of the string. Click here to download one of the Excel worksheets that one of my students submitted. After you open the document, look at the very bottom tab. If Sheet1 is selected, you should see just the raw data. You can try to fit that data yourself if you are up to the challenge. Otherwise, on Sheet2 I’ve set it up so you just need to create a good initial guess for the variables before using Solver, which is not that difficult because the graph will adjust as you change the coefficients. Sheet3 has been fully solved and shows the optimal values for the variables. With the 1590 data points that the student had gathered, the least-squares curve of best fit came out looking very good, as you can see below.

Good luck performing least-squares fits in your classroom with your students!