function [ns,co,sp] = confidence(x, clevel, track) % [s,c,s] = confidence(x, clevel) % % samplesize = confidence(x,0.95); % [dummy,conf] = confidence(x,Inf); % % Calculates confidence level (as in Pyle's book) for one numerical % variable. Infs and NaNs are discarded from x. % % Input arguments ([]'s are optional): % x (vector) the values of the variable % [clevel] (scalar) required confidence level. % If >=1, the confidence level using % all values is calculated. If missing, NaN or % empty, default value of 0.95 is used. Also % in the >=1 case the confidence level is 0.95 % but the algorithm is run through the whole data set. % [track] (scalar) tracking level, 0 for nothing (the default), % any other value to see development of % confidence level and sample histogram. % % Output arguments: % n (scalar) The number of samples. % c (scalar) The attained level of confidence. % s (scalar) Variable sparcity. % Based on Pyle's CD-ROM (Convergence.c), juuso 15101999 %% input arguments if nargin<2 | isnan(clevel) | isempty(clevel), clevel = 0.95; end if clevel>=1, clevel = 0.95; totalconf = 1; else totalconf = 0; end if nargin<3, track = 0; end %% code strings with numbers, handle empty variables, sparcity totlen = length(x); x = x(isfinite(x)); len = length(x); sparcity = 1-len/totlen; %% required number of tests ntests = ceil(log(1-clevel)/log(clevel)); sample_order = randperm(len); sample_x = []; sample_std = []; if track, cvc_vals = []; end nsamples = 0; ok = 0; while ~ok, nsamples = nsamples + 1; sample_x = x(sample_order(1:nsamples)); std_x = std(sample_x); if std_x == 0, % all values are the same cvc = 1-clevel^nsamples; else std_x = std(sample_x); sample_std(end+1) = std_x; std_std = std(sample_std); if length(sample_std) >= ntests, % at least ntests std samples required s = sample_std(end-ntests+1:end); r = max(s)-min(s); k = 0.5*r/std_std; cvc = 2*normcdf(-k); else cvc = 0; end end ok = ( (~totalconf & cvc >= clevel) | nsamples==len); if track, cvc_vals(end+1) = cvc; if rem(nsamples,100)==0 | ok, l = length(cvc_vals); subplot(1,2,1), plot(1:l,cvc_vals,'b-',[1 l],[clevel clevel],'r-') title('Confidence level'); subplot(1,2,2), hist(sample_x,50); title('Sample'); drawnow end end end ns = nsamples; co = cvc; sp = sparcity; return;