Matlab implementation details

Key steps explained briefly

Import

Import your data with a subfunction

[CUSTOM, CUSTOM_HEADERS, CUSTOM_META] = import_custom_data(path_Data);

Not the most general structure field names, so you need to change these if you want to play with your own data, and add the third column if your data for example comes with variance / standard deviation

CUSTOM{i}.lambda = data(:,1);
CUSTOM{i}.melatonin = data(:,2);

Parameters

The model fitting can be looped through these, so that you can do sensitivity analysis on how for example different normalization, weighing and transformation methods effect the model fits. For example LOGARITHMIC fit would need to be defined (more natural choice for photoreception), as current version only works for linear responses.

% Parameters
normalize_method = {'nonneg_maxnorm'}; % raw
model_strings = {'simple'; 'opponent_(L-M)'; 'opponent_(M+L)-S'; 'opponent_(+L-M)-S'}; % {'opponent_(+L-M)-S'};
error_for_fit_string = 'variance_relative'; % 'variance_relative'; % ; % 'variance_relative';
fit_domain = 'lin';

For example the logarithmic fitting emphasized by Russell G. Foster at Velux Daylighting Symposium in 2011

And how a simple nomogram fits differently to same data in LOG and in LIN domain:

Normalization

The default normalization to use the full range between 0 and 1, not much work put on this part, and you can add some fancy stuff here if you want.

elseif strcmp(normalize_method, 'nonneg_maxnorm')
        normStr = 'Non-negative_normToUnity';
        
        min_v = nanmin(y);
        y = y - min_v;
        max_v = nanmax(y);
        y = y / max_v;

FIT

This part is a bit patchy with too many levels of wrappers

Basically fit_model_to_melatonin_wrapper calls fit_QuickPooling_simple_v2018, which calls again = fit_mixedModelsForMelatonin=, that finally calls the “main” pooling model poolingModel_main (these could be combined later)

MAIN FUNCTION

Here we define first, the coefficients to be optimized:

% initial values
m0      = contr(1);
c0      = contr(2);
r0      = contr(3);
k1      = comb_k(1);
k2      = comb_k(2);
inhMWS  = oppon(4);
fB0     = oppon(1);
fD0     = oppon(2);
fE0     = oppon(3);                
dens_M  = densit(1);
dens_C  = densit(2);
dens_CS = densit(3);
dens_R  = densit(4);      

% Parameter that can be changed during optimization
x0 = [m0; c0; r0; k1; k2; inhMWS; fB0; fD0; fE0; dens_R; dens_C; dens_CS; dens_M];
x0_names = {'m0'; 'c0'; 'r0'; 'k1'; 'k2'; 'inhMWS'; 'fB0'; 'fD0'; 'fE0'; 'dens_R'; 'dens_C'; 'dens_CS'; 'dens_M'};

It is easier to structure the code so that all these are pushed to the optimizer but you can later set upper bound (ub) and lower bound (lb) to be the same the constrain that coefficient to be fixed

For example it is easier to keep the optical densities the same to limit the free parameters and reduce overfitting while physiologically speaking the pigment will bleached during light exposure reducing the optical density (and making the nomogram narrower compared to high optical density)

% MANIPULATE THESE if you want to CONSTRAIN some variables
% We are defining the free parameters for AIC, from these
% automatically, so we do not want to keep the opponent model
% parameters non-fixed
if strcmp(mode, 'simple') % no opponency
    lb = [0.0; 0.03; 0.1; 1.0; 10.0; 1; 1.5; 1.5; 1.25; 0.40; 0.38; 0.30; dens_M]; % lower bounds for x0 variables
    ub = [1.5; 1.12; 1.5; 1.0; 10.0; 1; 1.5; 1.5; 1.25; 0.40; 0.38; 0.30; dens_M]; % upper bounds for x0 variables                          
elseif strcmp(mode, 'opponent_(L-M)') % Kurtenbach et al. (1999) 
    lb = [0.0; 0.03; 0.1; 1.0; 10.0; 1; 1.5; 0.0; 0.00; 0.40; 0.38; 0.30; dens_M]; % lower bounds for x0 variables
    ub = [1.5; 1.12; 1.5; 1.0; 10.0; 1; 1.5; 1.5; 1.25; 0.40; 0.38; 0.30; dens_M]; % upper bounds for x0 variables                          
elseif strcmp(mode, 'opponent_(M+L)-S') % Spitschan et al. (2014)
    lb = [0.0; 0.03; 0.1; 1.0; 10.0; 1; 0; 0.0; 0.00; 0.40; 0.38; 0.30; dens_M]; % lower bounds for x0 variables
    ub = [1.5; 1.12; 1.5; 1.0; 10.0; 1; 1.5; 1.5; 1.25; 0.40; 0.38; 0.30; dens_M]; % upper bounds for x0 variables                          
elseif strcmp(mode, 'opponent_(+L-M)-S') %  Woelders et al. (2018)
    lb = [0.0; 0.03; 0.1; 1.0; 10.0; 0; 0.0; 0.0; 0.00; 0.40; 0.38; 0.30; dens_M]; % lower bounds for x0 variables
    ub = [1.5; 1.12; 1.5; 1.0; 10.0; 1.5; 1.5; 1.5; 1.25; 0.40; 0.38; 0.30; dens_M]; % upper bounds for x0 variables                          
end

Cost function

The actual function that we are try to optimize is the defined in the poolingModel_function.m, that in practice generates at each iteration of the optimizer (e.g. fmincon), the composite function from the action spectra (“basis functions” in a way)

% define the action spectra used in the optimization procedure
callFrom = 'poolingModel';            
[peak, templates] = poolingModel_defineNeededSpectra(mode, group, linLog, options);
xLims = peak{1}.nomogInputs{7};
xRes = peak{1}.nomogInputs{6};
lambda_nomo = (xLims(1):xRes:xLims(2))';
actSpectra = define_actionSpectra(lambda_nomo, peak, group, templates, callFrom);

% TODO! You could do debug plot here to make sure that
% actSpectra are indeed in LOG/LINEAR with OCULAR MEDIA
% correction

f = @(x) poolingModel_function(x, lambda, y, weights_for_fit, mode, statParam, output, actSpectra, options);

And then you might get better behavior with trying different algorithms for the optimization instead of interior-point used by default

optimOpt = optimset(optimOpt, 'Algorithm', 'interior-point');
[x, fval, exitflag, output_struct] = fmincon(f,x0,A,b,Aeq,beq,lb,ub,nonlcon,optimOpt);

You should not here that it is hard to get confidence intervals out of fmincon, so you could optimize this actually using lsqcurvefit

How to get ‘confidence interval’ with fmincon optimization:

TODO!
% The fmincon function does not return covariance matrices on the parameters it optimises
% so you cannot calculate confidence intervals on those estimates
% see https://www.mathworks.com/help/optim/ug/lsqcurvefit.html

And then in calc_fitStats() you can define the final metric used for optimization if you are not happy with the current choice of sum of squared errors that are either weighed or not with inverse of variance.

% Value to be minized
out = SS_err_norm;

Models

All the opponent models are defined in same manner initially in poolingModel_function.m

%% COMBINATION MODELS
    
if strcmp(mode, 'simple') == 1
    % Combine the 3 above defined terms - ORIGINAL VERSION
    Sfit = ( OPN4 + ( (C + R).^(1/x(4)) ).^x(5)  ) .^(1/x(5));                         
    
elseif strcmp(mode, 'opponent_(L-M)') == 1 % from Kurtenbach et al. (1999)   
    % add spectral opponency and S-cone contribution
    Sfit = ( OPN4 + ( (C + R + Opp).^(1/x(4)) ).^x(5)  ) .^(1/x(5));                  
    
elseif strcmp(mode, 'opponent_(M+L)-S') == 1 % Spitschan et al. (2014)
    Sfit = ( OPN4 + ( (C + R + Opp).^(1/x(4)) ).^x(5)  ) .^(1/x(5));
    
elseif strcmp(mode, 'opponent_(+L-M)-S') == 1 % Woelders et al. (2018)
    Sfit = ( OPN4 + ( (C + R + Opp).^(1/x(4)) ).^x(5)  ) .^(1/x(5));

With the Opp term being naturally slightly different. But if you feel like defining the terms different, you can define additional cases (elseif) here and test “easily” how does that change the fit result:

% Opponent term, from  Spitschan et al. (2014)
% https://dx.doi.org/10.1073/pnas.1400942111
% LWS+MWS - x*SWS
if strcmp(mode, 'opponent_(M+L)-S')
    Opp = (x(8) .* abs(x(9).*(LWS+MWS) - (x(7) .* SWS))) .^ x(4);
    
% Opponent term, from  Woelders et al. (2018)
% https://doi.org/10.1073/pnas.1716281115
% L - x1*M - x2*S
elseif strcmp(mode, 'opponent_(+L-M)-S')
    Opp = (x(8) .* abs(x(9).*LWS - (x(6) .* MWS) - (x(7) .* SWS))) .^ x(4);

% Opponent term, from Kurtenbach et al. (1999) 
% http://dx.doi.org/10.1364/JOSAA.16.001541
% LWS - x*MWS    
else % strcmp(mode, 'opponent_(L-M)')
    Opp = (x(8) .* abs(LWS - (x(9) .* MWS))) .^ x(4);
end

Note now that the number of free parameters (K in calc_fitStats.m) is defined by the function defineNoFreeParameters.m simply comparing the number of parameters in the ub and lb that are not the same.

See the excellent discussion in McDougal and Gamlin, 2010 about the coefficients and trying to limit the number of free parameters. Note that the melatonin suppression data from Brainard et al. 2001 “only” have 9 data points which is a huge number for spectral study of melatonin suppression, but very limited in terms of having a good estimate of the underlying photoreceptor contributions (compare for example to the raw microspectrometric data used by Govardovskii et al. 2000) for their nomogram template fitting. In other words, you might be very easily just overfitting the data.

Now for example the model implementation for Woelders et al. 2018 that performs the best qualitatively for melatonin suppression data has 7 free parameters using the exponent k1 and k2 values set fixed by McDougal and Gamlin, 2010.

Later in the R visualization part of this repository, the Akaike Information Criterion (AIC) should penalize (like does the Bayesian Information Criterion, BIC) for the “excessive use” of free parameters and aid you in selecting the best model. See the file plot_subfunctions.R for details under the function `recompute.fit.stats`

recompute.fit.stats = function(stat_in, n, K, y_in, y_fit, error, w)
  # We need to compute the BIC, AIC "by hand" as R has many functions if we would
  # have created the model in R, but now our model was defined in Matlab
  # n = number of data points (observations)
  # K = was the number of free parameters
  # w = weights used when fitting the model

  df.ll = K + 1

  # BIC
  bic = -2 * ll + log(n) * df.ll
  
  # AIC
  aic = -2 * ll + 2 * df.ll

Output

Outputs are wrapped inside a structure that can be then plotted or saved to a .csv file

fit_out.spec = spec;
fit_out.points = points;
fit_out.stats_out = stats_out;
fit_out.actSpectra = actSpectra;

fit_out.final_x = final_x;
fit_out.x0_names = x0_names;
fit_out.fval = fval;
fit_out.output_struct = output_struct;
fit_out.statParam = statParam;
fit_out.fit_stats = fit_stats;

Visualize

Fits are then visualized in visualize_pooling_models_for_CUSTOM() and the data points are saved in .csv file that are then visualized better using R and ggplot2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly