Examples of the functionality include:

- Integration (scipy.integrate)
- Optimization/Fitting (scipy.optimize)
- Interpolation (scipy.interpolate)
- Fourier Transforms (scipy.fft; scipy.fftpack is the legacy interface)
- Signal Processing (scipy.signal)
- Linear Algebra (scipy.linalg)
- Spatial data structures and algorithms (scipy.spatial)
- Statistics (scipy.stats)
- Multi-dimensional image processing (scipy.ndimage)

and so on.

Here we focus on fitting models to data with `curve_fit` from `scipy.optimize`, which is imported as follows:

In [ ]:

```
import numpy as np
from scipy.optimize import curve_fit
```

The full documentation for `curve_fit` is available in the SciPy reference guide. Here we will look at a simple example, which involves fitting a straight line to a dataset.

We first create a fake dataset with some random noise:

In [ ]:

```
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
```

In [ ]:

```
x = np.random.uniform(0., 100., 100)
y = 3. * x + 2. + np.random.normal(0., 10., 100)
plt.plot(x, y, '.')
```

In [ ]:

```
def line(x, a, b):
    return a * x + b
```

The function to fit must take the independent variable `x` as its first argument, followed by the parameters. We can now call `curve_fit` to find the best-fit parameters using a least-squares fit:

In [ ]:

```
popt, pcov = curve_fit(line, x, y)
```

The `curve_fit` function returns two items, which we call `popt` and `pcov`. The `popt` array contains the best-fit parameters for `a` and `b`:

In [ ]:

```
popt
```

which is close to the initial values of `3` and `2` used in the definition of `y`.
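As a cross-check on the least-squares result, we can compare `curve_fit` with `np.polyfit`, which also performs a least-squares fit for polynomial models. This is a minimal sketch: the fixed random seed is an assumption added here for reproducibility, and the exact numbers will differ from those in the cells above.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, a, b):
    """Straight-line model with slope a and intercept b."""
    return a * x + b


# Synthetic data built the same way as above, with a fixed seed
# (the seed is an assumption, used only to make the run repeatable).
rng = np.random.default_rng(42)
x = rng.uniform(0., 100., 100)
y = 3. * x + 2. + rng.normal(0., 10., 100)

popt, pcov = curve_fit(line, x, y)

# np.polyfit returns coefficients from the highest degree down,
# so for degree 1 the order is [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

print("curve_fit:", popt[0], popt[1])
print("polyfit:  ", slope, intercept)
```

Both routines minimize the same sum of squared residuals for this model, so the two sets of parameters should agree to within numerical tolerance.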

The values are not exact because there are only a limited number of noisy samples, so the best-fit slope and intercept will not be exactly those used in the definition of `y`. The `pcov` variable contains the *covariance* matrix, which indicates the uncertainties and correlations between parameters. This is mostly useful when the data has uncertainties.
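To make the role of the covariance matrix concrete, the sketch below turns `pcov` into one-sigma parameter uncertainties (square roots of the diagonal) and a correlation matrix. The names `perr` and `corr` and the fixed seed are choices made here for illustration, not part of the original notebook.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, a, b):
    """Straight-line model with slope a and intercept b."""
    return a * x + b


# Synthetic data as above; the seed is an assumption for repeatability.
rng = np.random.default_rng(0)
x = rng.uniform(0., 100., 100)
y = 3. * x + 2. + rng.normal(0., 10., 100)

popt, pcov = curve_fit(line, x, y)

# Diagonal entries are the parameter variances.
perr = np.sqrt(np.diag(pcov))

# Normalizing by the outer product of the standard deviations gives
# the correlation matrix, with ones on the diagonal.
corr = pcov / np.outer(perr, perr)

print("uncertainties on a, b:", perr)
print("correlation between a and b:", corr[0, 1])
```

For this dataset, where all `x` values are positive, the slope and intercept come out anti-correlated: a steeper line needs a lower intercept to pass through the same cloud of points.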

In [ ]:

```
e = np.repeat(10., 100)
plt.errorbar(x, y, yerr=e, fmt="none")
```

In [ ]:

```
popt, pcov = curve_fit(line, x, y, sigma=e)
```

In [ ]:

```
popt
```

Now `pcov` will contain the variances and covariances of the parameters, so that the best-fit parameters and their uncertainties are:

In [ ]:

```
print("a =", popt[0], "+/-", pcov[0,0]**0.5)
print("b =", popt[1], "+/-", pcov[1,1]**0.5)
```

We can now plot the best-fit line:

In [ ]:

```
plt.errorbar(x, y, yerr=e, fmt="none")
xfine = np.linspace(0., 100., 100) # define values to plot the function for
plt.plot(xfine, line(xfine, popt[0], popt[1]), 'r-')
```

`curve_fit` will be good enough for most simple cases.

Note that there is a way to simplify the call to the function with the best-fit parameters, which is:

```
line(x, *popt)
```

The `*` notation expands a sequence of values into the positional arguments of the function. This is useful if your function has more than one or two parameters. Hence, you can do:

In [ ]:

```
plt.errorbar(x, y, yerr=e, fmt="none")
plt.plot(xfine, line(xfine, *popt), 'r-')
```
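The equivalence of the two call styles can be checked directly. In this small sketch, `popt` is filled with stand-in values rather than an actual fit result, so that the example runs on its own.

```python
import numpy as np


def line(x, a, b):
    """Straight-line model with slope a and intercept b."""
    return a * x + b


# Stand-in values for illustration only, not the output of a fit.
popt = np.array([3., 2.])
xfine = np.linspace(0., 100., 5)

explicit = line(xfine, popt[0], popt[1])  # parameters passed one by one
unpacked = line(xfine, *popt)             # parameters unpacked from popt

print(np.array_equal(explicit, unpacked))
```

The unpacked form keeps working unchanged if the model later gains more parameters, as long as `popt` has one entry per parameter.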

**Important Note:** by default, `curve_fit` determines the parameter uncertainties by renormalizing the errors so that the reduced $\chi^2$ value is one. As a result, the absolute magnitude of the errors doesn't matter, only their relative sizes. In some fields of science (such as astronomy) we do *not* renormalize the errors, so for those cases you can specify `absolute_sigma=True` in order to preserve the original errors.
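The effect of this option can be seen by doubling all the error bars and comparing the resulting covariance matrices. The following is a minimal sketch with a fixed seed (an assumption for reproducibility); the variable names are chosen here for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, a, b):
    """Straight-line model with slope a and intercept b."""
    return a * x + b


# Synthetic data as above; the seed is an assumption for repeatability.
rng = np.random.default_rng(1)
x = rng.uniform(0., 100., 100)
y = 3. * x + 2. + rng.normal(0., 10., 100)
e = np.repeat(10., 100)

# Default behavior (absolute_sigma=False): errors act as relative weights.
_, pcov_rel = curve_fit(line, x, y, sigma=e)
_, pcov_rel_2x = curve_fit(line, x, y, sigma=2 * e)

# absolute_sigma=True: errors are taken at face value.
_, pcov_abs = curve_fit(line, x, y, sigma=e, absolute_sigma=True)
_, pcov_abs_2x = curve_fit(line, x, y, sigma=2 * e, absolute_sigma=True)

print("relative, e:   ", np.sqrt(np.diag(pcov_rel)))
print("relative, 2e:  ", np.sqrt(np.diag(pcov_rel_2x)))
print("absolute, e:   ", np.sqrt(np.diag(pcov_abs)))
print("absolute, 2e:  ", np.sqrt(np.diag(pcov_abs_2x)))
```

With the default setting, the two sets of uncertainties should match, since doubling every error leaves the relative weights unchanged. With `absolute_sigma=True`, doubling the errors doubles the parameter uncertainties (the covariance matrix scales by a factor of four).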

In the following code, we generate some random data points:

In [ ]:

```
x = np.random.uniform(0., 10., 100)
y = np.polyval([1, 2, -3], x) + np.random.normal(0., 10., 100)
e = np.random.uniform(5, 10, 100)
```

Fit a line and a parabola to it and overplot the two models on top of the data:

In [ ]:

```
# your solution here
```

In [ ]:

```
# The following code reads in the file and removes bad values
import numpy as np
date, temperature = np.loadtxt('data/munich_temperatures_average_with_bad_data.txt', unpack=True)
keep = np.abs(temperature) < 90
date = date[keep]
temperature = temperature[keep]
```

Fit the following function to the data:

$$f(t) = a~\cos{(2\pi t + b)} + c$$

where $t$ is the time in years. Make a plot of the data and the best-fit model in the range 2008 to 2012. What are the best-fit values of the parameters? What is the overall average temperature in Munich, and what are the typical daily average values predicted by the model for the coldest and hottest time of year? What is the meaning of the `b` parameter, and does its value make sense?

In [ ]:

```
# your solution here
```