Curves - Patch Clamp Handbook

Curves are abundant in neuroscience. In particular curves you will typically see are exponential decay, sigmoid, logarithmic and linear. Exponential decays occur in PSC/PSPs (both rise and decay), membrane taus (current clamp), and membrane currents (Ih-currents). Sigmoid currents occur in dose-response curves (FI curves and Ih currents). Logarithmic curves occur in membrane vorage changes in spiking activity in cells. Linear curves are seen in IV curves. In neuroscince many of these curves are measuring some sort of dose-response or response-time relationship.

You have seen these curves in action in several chapters such as the Current Clamp and m/sEPSC. In this chapter we will delve into the specific parameters of the curves and how the curve_fit function in Scipy works.

Some quick basics that all curves typically have. Many equations for curves go from 0 to 1 or 1 to 0. However, most data we collect is not scaled like this. Curves typically have some scaling factor which means you divide or multiple the equation by a number to change the range of possible the equation can output. Curves also have shifting factors which are added or subtracted to the equation and move the equation up or down or to the left or right. This is really a core concept that I did not pick up early on but wish I had.

import numpy as np
from bokeh.io import output_notebook, show
from bokeh.layouts import column, row, layout
from bokeh.models import ColumnDataSource, CustomJS, Slider
from bokeh.plotting import figure, gridplot
from scipy import optimize, stats

output_notebook()

Exponential decay¶

Exponential decay occurs in processes where a value decreases proportionally to its current value. Exponential decay is primarily parameterized by the decay constant, $\lambda$ . $\lambda$ can be describe as a time constant, $\tau$ where $\tau = \frac{1}{\lambda}$ . $\tau$ , the time constant or the time if takes for the process to reach a value of ~1/3 or $\frac{1}{e}$ of the original value. The basic exponential equation is: $N_{0}e^{-\frac{t}{\tau}}$ . $N_{0}$ is the original value, and in electrophysiological terms is the amplitude of a PSC/PSP. The equation expects some things. One is that you value decay towards zero. The second is that typically $N_{0}$ is at time point zero. When we curve fit the mEPSCs in the m/sEPSC chapter we created a new time array that started at zero.

The decay equation can be scaled and shifted in several ways. $N_{0}$ can be positive or negative. $N_{0}$ is multiplicative scaling factor, since you leave it out of the equation and you will just get 1 for time value zero since $e^0 = 1$ . $\tau$ is a also just a multiplicative scaling factor but for time. You can shift the whole equation up or down by adding a constant if your decay does not converge towards zero but some other constant value. There is no way to just shift the equation along in time. That is because you are assumming $N_{0}$ is $time = 0$ and if you shift $t$ in the equations you change where you are measuring the decay from. To shift the equation in time you just add a constant to your x values after putting them through the equation. You can also add decay equations together to get double, triple or even more decays. However, adding more than 2 or 3 decays together can lead to over fitting and decreased interpretability.

Single exponential decay¶

Here is the equation for a single exponential decay:

y_{shift} + amplitude*e^{\frac{-x}{tau}}

(1)

We use $y_{shift}$ as the shifts the curve along the y axis acting as a constant offset, $x$ is the time array that you put into the equation, $amplitude$ is the starting value of the decay, and $tau$ is the decay constant you want to use for your equation.

def exp_decay(x, amplitude=1, tau=1, yshift=0):
    y = yshift + (amplitude * np.exp(-x / tau))
    return y


x = np.arange(300) / 10
source = ColumnDataSource({"x": x, "y": exp_decay(x, amplitude=15, tau=2.5)})

plot = figure(width=400, height=400)

plot.line(x, exp_decay(x, amplitude=15, tau=2.5), line_width=2, line_color="black")
plot.line("x", "y", source=source, line_width=3, line_alpha=0.6, line_color="magenta")

yshift = Slider(start=-10, end=10, value=0, step=0.25, title="Y shift")
decay_tau = Slider(start=0.75, end=50, value=2.5, step=0.25, title="Decay tau")
amplitude = Slider(start=-30, end=30, value=15, step=0.5, title="Amplitude")
length = Slider(start=10, end=70, value=30, step=0.25, title="Length")

callback = CustomJS(
    args=dict(
        source=source,
        decay_tau=decay_tau,
        amplitude=amplitude,
        length=length,
        yshift=yshift,
    ),
    code="""
    const len = Math.round(length.value * 10)
    const t_length = Array.from({ length: len }, (_, i) => i/10)
    const y = t_length.map(x => {
        return (amplitude.value * Math.exp(-x / decay_tau.value)) + yshift.value
    })
    source.data.x = t_length;
    source.data.y = y;
    source.change.emit();
""",
)

yshift.js_on_change("value", callback)
decay_tau.js_on_change("value", callback)
amplitude.js_on_change("value", callback)
length.js_on_change("value", callback)

show(row(plot, column(decay_tau, amplitude, length, yshift)))

Double exponential decay¶

You will see that you can get much more complicated shapes with a double exponential decay. Some shapes don’t even look like a regular decay. There are a couple ways you can parameterize the double exponential decay. You could have a y_shift for each decay or just one for both. Usually when we think a curve is a double exponential decay we are assuming that the amplitude of both decays has the same sign. Additionally, you could make the amplitude the same for each decay or allow them to be different. It starts to get very complicated. You can have any combination of factors you choose. For fitting PSC/PSP decays I usually include an amplitude and tau parameter for each decay but ensure that the amplitudes both have the same sign. In the example below I do not ensure the amplitudes are the same sign. You could also make it so that the fast tau cannot be slower than the slow tau. Lastly the equation I am using here is an additive double exponential decay you can also have a multiplicative which is used for create an PSC/PSP template. Here is the equation I am using:

(amplitude_{slow}*e^{\frac{-x}{tau_{slow}}})+(amplitude_{fast}*e^{\frac{-x}{tau_{fast}}})

(2)

There is for each decay we have an amplitude and tau. Compared to the single exponential decay we do not have a constant to shift decay but you could include it. I also encourage you to reparameterize the equation if you feel confident enough.

def db_decay(x, amplitude_fast=1, tau_fast=1, amplitude_slow=0, tau_slow=1):
    y = (amplitude_slow * np.exp(-x / tau_slow)) + (
        amplitude_fast * np.exp(-x / tau_fast)
    )
    return y


x = np.arange(300) / 10
source_db = ColumnDataSource(
    {
        "x": x,
        "y": db_decay(x, amplitude_fast=15, tau_fast=2.5),
        "y_slow": exp_decay(x, 1, 1),
        "y_fast": exp_decay(x, 15, 2.5),
    }
)

plot = figure(width=400, height=400, title="Double exponential decay")

plot.line(
    x, db_decay(x, amplitude_fast=15, tau_fast=2.5), line_width=2, line_color="black"
)
plot.line(
    "x", "y", source=source_db, line_width=3, line_alpha=0.6, line_color="magenta"
)
decay_slow = figure(width=300, height=200, title="Slow decay")
decay_slow.line("x", "y_slow", source=source_db,  line_width=3, line_alpha=0.6, line_color="black")
decay_fast = figure(width=300, height=200, title="Fast decay")
decay_fast.line("x", "y_fast", source=source_db, line_width=3, line_alpha=0.6, line_color="black")

tau_slow = Slider(start=0.75, end=50, value=1, step=0.25, title="Tau slow")
amplitude_slow = Slider(start=-30, end=30, value=1, step=0.5, title="Amplitude_slow")
tau_fast = Slider(start=0.75, end=50, value=2.5, step=0.25, title="Tau fast")
amplitude_fast = Slider(start=-30, end=30, value=15, step=0.5, title="Amplitude fast")
length = Slider(start=10, end=70, value=30, step=0.25, title="Length")

callback = CustomJS(
    args=dict(
        source=source_db,
        tau_slow=tau_slow,
        amplitude_slow=amplitude_slow,
        tau_fast=tau_fast,
        amplitude_fast=amplitude_fast,
        length=length,
    ),
    code="""
    const len = Math.round(length.value * 10)
    const t_length = Array.from({ length: len }, (_, i) => i/10)
    const y = t_length.map(x => {
        return ((amplitude_slow.value * Math.exp((-x) / tau_slow.value))
        +(amplitude_fast.value * Math.exp((-x) / tau_fast.value)))
    })
    const y_slow = t_length.map(x => {
        return (amplitude_slow.value * Math.exp(-x / tau_slow.value))
    })
    const y_fast = t_length.map(x => {
        return (amplitude_fast.value * Math.exp(-x / tau_fast.value))
    })
    source.data.x = t_length;
    source.data.y = y;
    source.data.y_slow = y_slow;
    source.data.y_fast = y_fast;
    source.change.emit();
""",
)

tau_slow.js_on_change("value", callback)
amplitude_slow.js_on_change("value", callback)
tau_fast.js_on_change("value", callback)
amplitude_fast.js_on_change("value", callback)
length.js_on_change("value", callback)

buttons = column(tau_slow, amplitude_slow, tau_fast, amplitude_fast, length)
show(layout([[plot, buttons], [decay_slow, decay_fast]]))

Sigmoid curve¶

You will often see sigmoidal curves in input-output experiments. Some of these experiments include FI curves for firing rate and current density curves for ion channels (like Ih channels). This is because there is often a minimal and maximal amount of voltage/current primarily due the receptors present on a cell as well as the membrane composition. The basic sigmoid curve equation is: $\frac{1}{1+e^{-x}}$ . This specific equation is actually the logistic function, but there are other functions that have sigmoidal shapes. You can further parameterize the equation by subtracting/adding a value to x to shift the equation along the x-axis. You can divide x by a number which is the slope. You can multiply the whole equation by a value to scale the output. Lastly you can add a value to the exquation to offset the axis. The full equation looks like this:

\frac{1}{1+e^\frac{x-m}{s}}*y+o

(3)

where $s=slope$ , $m=x_{offset}$ , $y=y_{scale}$ and $o=y_{offset}$ . You may also see the equation as: $bottom-\frac{bottom-top}{1+10^{(LogEC50-X)*HillSlope}}$ .

def sigmoid(x, max_value, midpoint, slope, offset):
    return 1 / (1 + np.exp((x - midpoint) / -slope)) * max_value + offset


x = np.linspace(0, 400)
y1 = sigmoid(x, 16.0, 200.0, 40.0, 0)
y2 = sigmoid(x, 16.0, 200.0, 40.0, 0)

source1 = ColumnDataSource({"x": np.linspace(0, 400), "y1": y1, "y2": y2})
source2 = ColumnDataSource(
    {"x": np.linspace(0, 400, num=49), "y1": np.diff(y1), "y2": np.diff(y2)}
)

plot1 = figure(width=300, height=300)
plot1.line("x", "y1", source=source1, line_width=2, line_alpha=0.6, line_color="black")
plot1.line(
    "x", "y2", source=source1, line_width=3, line_alpha=0.6, line_color="magenta"
)
plot2 = figure(width=300, height=300)
plot2.line("x", "y1", source=source2, line_width=2, line_alpha=0.6, line_color="black")
plot2.line(
    "x", "y2", source=source2, line_width=3, line_alpha=0.6, line_color="magenta"
)

max_val = Slider(value=16.0, start=-30, end=30.0, step=1.0, title="Max value")
slope = Slider(value=40, start=-60, end=60.0, step=1.0, title="Slope")
midpoint = Slider(value=200.0, start=-400, end=400.0, step=1.0, title="Midpoint")
offset = Slider(value=0, start=-30.0, end=30.0, step=1.0, title="Offset")


callback = CustomJS(
    args=dict(
        source1=source1,
        source2=source2,
        max_val=max_val,
        slope=slope,
        midpoint=midpoint,
        offset=offset,
    ),
    code="""
    const data = source1.data;
    const x_array = data['x'];
    const y1 = x_array.map((x) => {
        return 1 / (1 + Math.exp((x - midpoint.value) / -slope.value)) * max_val.value + offset.value
    })
    source1.data['y1'] = y1
    const y1_diff = y1.slice(1).map((value, index) => value - y1[index])
    source2.data['y1'] = y1_diff
    source1.change.emit();
    source2.change.emit();
""",
)

max_val.js_on_change("value", callback)
slope.js_on_change("value", callback)
midpoint.js_on_change("value", callback)
offset.js_on_change("value", callback)

show(
    column(
        column(max_val, offset, slope, midpoint),
        gridplot([[plot1, plot2]]),
    )
)

Logarithmic¶

The logarithmic curve is pretty much just the exponential decay with some alternative assumptions and a different formula. The logarithmic function seems to closely model how the measures of spike shape, such as the area under the curve, changes as the spike frequency increases. This equation is not often used in neuroscience but I will present it here since it is relevant to the current clamp data we collected. The basic equation is: $C*log(x)$ . $C$ is a scaling value and $x$ is the x position. The equation can further modified to shift is in the y direction: $C*log(x)+v$ . You can technically modify the equation to shift the x value however you cannot have a $log(0)$ . Shifting the x can improve curve fitting but has to be done carefully to avoid numerical issues with the log functions. I use the follow equation in the exmaple below:

v_{scale}*log_{e}(x-x_{shift})+offset

(4)

def log_func(x, vscale, offset=0, xshift=0):
    return vscale * np.log(x-xshift) + offset


x = np.arange(1, 301)
source = ColumnDataSource({"x": x, "y": log_func(x, vscale=5)})

plot = figure(width=400, height=400)

plot.line(x, log_func(x, vscale=5), line_width=2, line_color="black")
plot.line("x", "y", source=source, line_width=3, line_alpha=0.6, line_color="magenta")

yshift = Slider(start=-10, end=10, value=0, step=0.25, title="Y shift")
yscale = Slider(start=-10, end=10, value=5, step=0.5, title="Y scale")
length = Slider(start=1, end=600, value=301, step=0.25, title="Length")

callback = CustomJS(
    args=dict(
        source=source,
        yscale=yscale,
        length=length,
        yshift=yshift,
    ),
    code="""
    const len = Math.round(length.value)
    const t_length = Array.from({ length: len }, (_, i) => (1 + i))
    const y = t_length.map(x => {
        return (yscale.value * Math.log(x)+yshift.value)
    })
    source.data.x = t_length;
    source.data.y = y;
    source.change.emit();
""",
)

yshift.js_on_change("value", callback)
yscale.js_on_change("value", callback)
length.js_on_change("value", callback)

show(row(plot, column(yscale, length, yshift)))

Fitting curves in Python¶

This next section we will go over how to curve fit in Python. There are two ways to fit curves in Python: optimization-based curve fitting or polynomial curve fitting. If you want values that are interpretable you will want to use the the equations we have provided. If you just want to see if there are different curve shapes you can use the polynomial fit but you will have to run stats on all the coefficients since you will not have a 1 to 1 mapping of what coefficient represents what part of the curve shape. I recommend sticking with the optimization-based methods since it is easy to tweak to get resonable fits.

Curve fitting in Python can be done using the curve_fit function in the optimization module of Scipy. Curve fit has 3 main parameters you have to pass and two others that are very useful for getting good fits. You will need some function to pass that outputs an array of values. The function must accept an array of x values for the first parameter and have 1 or more parameters that are going to optimized such as def test(x, param1, param2, ...). You will also need a x array that is passed to the function and y array that the output of the function is compared to. The curve_fit function minimizes the sum of squares alternatively known as least squares. That means it does this sum((y-y_hat)**2) in Python code or in mathematic notation: $\sum_{i=1}^{n}({y_{i}-\hat{y}_{i}})^2$ ( $\hat{y}$ is the output of the function).

Since curve_fit is an optimization algorithm it works by finding the an minumum which is based on the derivative of the function. The more complex your function is, the harder it will be to fit your curve. This is because there are several minima that the algorithm could arrive at. There are three ways that you can improve the curve_fit function output.

Provided start values (p0).
Provided and upper (ub) and lower (lb) bounds. Some equations you must provide bounds to get a good fit since some equations cannot have zeros such as log or where you are trying to optimize a divisor (sigmoid function).
Set your x-values in a range that is acceptable for the equation. For logarithmic equation make sure your x-value is above 0. You can do this by adding 1e-10 to your x-values if they start at 0. If your x-values are very large you can subtract the mininum and add 1e-10.

Another factor to consider when curve fitting is what values special functions like log and exponential can take. Exponentials can easily overflow or go beyond the maximum value the computer can represent. So when you set your bounds and initial values make sure they are set so that you do not get any numerical issues. If you are working with very large or very small values it can be beneficial to transform your data by scaling (multiply/divide) or shifting (add/subtract) before curving fitting.

Scipy curve_fit outputs two numpy array. One is the parameters you are optimizing and one is the covariance matrix. The diagonal of the covariance matrix gives the variance of your parameters. You can use this to calculate the confidence intervals of your parameters which gives you an idea about the reliability of the fit. You can also use the covariance matrix to calculate the confidence intervals of your curve. You can also use the covariance matrix to calculate the confidence intervals of your parameters. This relies on some more advanced math knowledge (but is really simple code). We will calculate a simple version based on code from here: https://github.com/gjpelletier/delta_method. You essential need to calculate the derivative of your equation which in our case means calculate the partial derivative for each parameter and create something called the Jacobian matrix (a matrix of partial derivatives). To calculate the Jacobian matrix you do the following for each parameter in the order they occur:

Add a small value (1e-8) to the current parameter (i.e tau).
Get a new set of y values.
Subtract old y values from the new then divide by the difference between the new and old parameter.
Subtract a small value from the current parameter.
Get a new set of y values.
Subtract old y values from the new then divide by the difference between the new and old parameter.
Average the two sets of y values together and set them as column in your output matrix.

def jacobian(func, x, popt, h=1e-8):
    y_new = func(x, *popt)
    # calculate derivative gradients at x (change in func(x) per change in each popt)
    grad_new = np.empty((len(x), len(popt)))
    for i in range(len(popt)):
        # make a copy of popt
        popt2 = np.copy(popt)
        # gradient forward
        popt2[i] = (1 + h) * popt[i]
        y_new2 = func(x, *popt2)
        grad_up = (y_new2 - y_new) / (popt2[i] - popt[i])

        # gradient backward
        popt2[i] = (1 - h) * popt[i]
        y_new2 = func(x, *popt2)
        grad_dn = (y_new2 - y_new) / (popt2[i] - popt[i])

        # centered gradient is the average gradient forward and backward
        grad_new[:, i] = (grad_up + grad_dn) / 2
    return grad_new


def confidence_intervals(func, x, popt, pcov, alpha=0.05):
    # calculate variance in y_new due to each parameter and for all parameters combined
    jac = jacobian(func, x, popt)
    sigma = np.sqrt(np.sum(
        (jac @ pcov) * jac, axis=1
    )) # total variance from all popt values at each x
    # - - -
    # # lwr_conf and upr_conf are confidence intervals of the best-fit curve
    nobs = len(x)
    nparam = len(popt)
    df = nobs - nparam
    qt = stats.t.ppf(1 - alpha / 2, df)
    delta_f = sigma * qt
    y_new = func(x, *popt)
    lwr_conf = y_new - delta_f
    upr_conf = y_new + delta_f
    return lwr_conf, upr_conf


def param_ci(y, popt, pcov, alpha=0.05):
    n = len(y)
    p = len(popt)

    dof = max(0, n - p)
    tval = stats.t.ppf(1.0 - alpha / 2.0, dof)
    ci = np.sqrt(np.diag(pcov)) * tval
    return ci

Curve fit exponential decay¶

The exponential decay is fairly easy to curve fit. We are going to generate some synthetic data and then curve fit the data. There are three parameters that we will be fitting: amplitude, tau and yshift.

x = np.arange(300) / 10
y = exp_decay(x, amplitude=-15, tau=5.0, yshift=-1)
rng = np.random.default_rng(seed=42)
y = y + rng.uniform(size=300)

# initial parameters
my = np.min(y)
mx = x[y > (np.min(y) / np.exp(1))][0]
p0 = [my, mx, 0]
ub = [0, np.inf, np.inf]
lb = [-np.inf, 0, -np.inf]
bounds = [lb, ub]

popt, pcov = optimize.curve_fit(exp_decay, x, y, p0, bounds=bounds)

lower, upper = confidence_intervals(exp_decay, x, popt, pcov)
ci = param_ci(y, popt, pcov, alpha=0.05)

fig = figure()
fig.line(x, y, line_color="black")
fig.line(x, exp_decay(x, *popt), line_color="magenta", line_width=2)
fig.varea(x, lower, upper, color="orange", alpha=0.5)
print(f"Amplitude: -15 vs {popt[0]:.2f} \u00b1 {ci[0]:.2f}")
print(f"Tau: 5 vs {popt[1]:.2f} \u00b1 {ci[1]:.2f}")
print(f"Y offset: -1 vs {popt[2]:.2f} \u00b1 {ci[2]:.2f}")
show(fig)

Curve fit the double exponential decay¶

The double exponential decay is harder to curve fit. We are going to generate some synthetic data and then curve fit the data. You will notice the fit is not perfect and our values are a bit different. In this case the starting values (p0) make a large difference. While you cannot do this in the browser, I recommend running the code in the jupyter notebook and changing the input to the db_decay equations as well as the start values and bounds. You will see that is can get pretty hard to fit well is you cannot make a good first guess. There are four parameters that we will be fitting: amplitude 1, tau 1, amplitude 2, and tau 2.

x = np.arange(300) / 10
y = db_decay(x, -15, 5.0, -3, 2.5)
rng = np.random.default_rng(seed=42)
y = y + rng.uniform(size=300)

# initial parameters
my = np.min(y)
mx = x[y > (np.min(y) / np.exp(1))][0]
p0 = [my, mx, 0, 0]
ub = [0, np.inf, 0, np.inf]
lb = [-np.inf, 0, -np.inf, 0]
bounds = [lb, ub]

popt, pcov = optimize.curve_fit(db_decay, x, y, p0, bounds=bounds)

lower, upper = confidence_intervals(db_decay, x, popt, pcov)
ci = param_ci(y, popt, pcov)

fig = figure()
fig.line(x, y, line_color="black")
fig.line(x, db_decay(x, *popt), line_color="magenta", line_width=2)
fig.varea(x, lower, upper, color="orange", alpha=0.5)
print(f"Amplitude 1: -15 vs {popt[0]:.2f} \u00b1 {ci[0]:.2f}")
print(f"Tau 1: 5 vs {popt[1]:.2f} \u00b1 {ci[1]:.2f}")
print(f"Amplitude 2: -3 vs {popt[2]:.3f} \u00b1 {ci[0]:.2f}")
print(f"Tau 2: 2.5 vs {popt[3]:.3f} \u00b1 {ci[1]:.2f}")
show(fig)

Curve fit the sigmoid¶

Similar to the double exponential decay, I have found that the sigmoid curve fitting is very sensitive to start parameters. The sigmoid curve is also sensitive to how much of the curve you have. If you have a partial curve you may not get a good fit. The sigmoid function has four parameters we will be fitting: max_value, midpoint, slope, and offset.

x = np.linspace(0, 400)
y = sigmoid(x, 16.0, 200.0, 40.0, 0)
rng = np.random.default_rng(seed=42)
y = y + rng.uniform(size=50)

# initial parameters

p0 = [np.max(y), x[-1]/2, 15, 0]
ub = [np.inf, np.inf, np.inf, np.inf]
lb = [0, 0, 1e-8, -np.inf]
bounds = [lb, ub]

popt, pcov = optimize.curve_fit(sigmoid, x, y, p0, bounds=bounds)

lower, upper = confidence_intervals(sigmoid, x, popt, pcov)
ci = param_ci(y, popt, pcov)

fig = figure()
fig.line(x, y, line_color="black")
fig.line(x, sigmoid(x, *popt), line_color="magenta", line_width=2)
fig.varea(x, lower, upper, color="orange", alpha=0.5)
print(f"Max value: 16 vs {popt[0]:.2f} \u00b1 {ci[0]:.2f}")
print(f"Midpoint: 200 vs {popt[1]:.2f} \u00b1 {ci[1]:.2f}")
print(f"Slope: 40 vs {popt[2]:.3f} \u00b1 {ci[0]:.2f}")
print(f"Offset: 0 vs {popt[3]:.3f} \u00b1 {ci[1]:.2f}")
show(fig)

Curve fit the logarithmic curve¶

The logarithmic curve is fairly easy to fit but has one requirement if you are going to use the xshift value. The xshift value must be above 0. This example shows how to correctly set the bounds for a value that cannot be above a certain value.

x = np.arange(300,600)
y = log_func(x, 15, 200, xshift=299)
rng = np.random.default_rng(seed=42)
y = y + rng.uniform(size=300)*5

# # initial parameters
xmin= np.min(x)
xmax = np.max(x)
ymin = np.min(y)
ymax = np.max(y)
if np.abs(ymin) > np.abs(ymax):
    numerator = ymin-ymax
    offset_est = ymax
else:
    offset_est = ymin
    numerator = ymax-ymin
divisor = max((xmax - xmin), 1)
vscale_est = numerator / max(np.log(divisor), 1)
p0 = [vscale_est, offset_est, 0]
ub = [np.inf, np.inf, np.min(x)-1e-6]
lb = [-np.inf, -np.inf, -np.inf]
bounds = [lb, ub]

popt, pcov = optimize.curve_fit(log_func, x, y, p0, bounds=bounds)

lower, upper = confidence_intervals(log_func, x, popt, pcov)
# ci = param_ci(y, popt, pcov)

fig = figure()
fig.line(x,y, line_color="black")
fig.line(x, log_func(x, *popt), line_color="magenta", line_width=2)
fig.varea(x, lower, upper, color="orange", alpha=0.3)
print(f"Y scale: 15 vs {popt[0]:.2f} \u00b1 {ci[0]:.2f}")
print(f"Y offset: 200 vs {popt[1]:.2f} \u00b1 {ci[1]:.2f}")
show(fig)

Curve fitting equations with some variable and some fixed parameters¶

Occassionally you may want to curve fit an equation but fix one or two parameters. One way to do that is in the actual function definition. You can set a default value def func(x, a, b, c=0). Since default arguments have to be set after positional arguments the curve fit function does not even see them. Another way is to pass a lambda function and set defaults for specific arguments such as lambda x, a, b: log_func(x, a, b, c=10). You can also define a new function or use Python’s partial functions (very similar to lambda functions).

Curve fitting with constraints on parameters¶

Scipy curve_fit allows you to specify lower and upper bounds for parameters but does not allow you to ensure that one parameter is large than the other such as if we want to fit a fast and slow to decay. You can transform your parameters to ensure that they meet some constraint. For example you could multiply one value by some constant that is also fit. If you want your fast decay to also be less than your slow decay you can fast = slow * multiplier then ensure that multiplier is between 0 and 1 using the bounds argument so that the fast decay is always some fraction of the slow decay. This will make your equation harder to fit but can also make the output more interpretable since fast decay will always be less than the slow decay.

Dealing with outliers¶

Since Scipy’s curve_fit is based on least squares minimization it sensitive to outliers. If you provide bounds, you can pass a keyword parameter loss= and provide one of the following loss functions: soft_l1, huber, cauchy, or arctan. It is recommended to start with soft_l1. You can provide your own loss function as well. These functions weight the squared values ((y-y_hat)**2) before summing them. You can see least_square for more information on loss functions.

Advanced curve fitting¶

If you need more advanced curve fitting then a good package to checkout is lmfit. This package is built on top of Scipy’s curve_fit but offers a lot of flexibility in terms of fixed or variable parameters, complex equations and variability estimates of parameters.

Chapters

Distributions

Chapters

Synthesizing data (IN PROGRESS)