mersenneforum.org Bunghole linear regression

 2014-03-27, 11:27 #1 henryzz Just call me Henry "David" Sep 2007 Liverpool (GMT/BST) 136008 Posts Bunghole linear regression

I am trying to fit a linear regression to the borehole function using Latin hypercube sampling. http://www.sfu.ca/~ssurjano/borehole.html

Outside a narrow range of sample sizes (46 to 252 points), I get rank-deficiency warnings when I run the regress function with terms up to quadratic interactions. Fewer than 46 points obviously won't work, as there are 45 predictors. Why am I having trouble with more than 252?

The code I am using:

Code:
    % Truncated normal on [L,U] via the inverse CDF; D holds uniform(0,1) draws (N is unused)
    tNorm = @(mu, sigma, L, U, N, D) norminv((1-D)*normcdf(L,mu,sigma) + D*normcdf(U,mu,sigma), mu, sigma);
    n = 2^8-4;
    X = ones(n,1);                 % column 1: intercept
    X(:,2:9) = lhsdesign(n,8);     % columns 2-9: Latin hypercube sample on [0,1]
    % Map each column onto its borehole input distribution
    X(:,2) = tNorm(0.10,0.0161812,0.05,0.15,n,X(:,2));
    X(:,3) = exp(tNorm(7.71,1.0056,log(100),log(50000),n,X(:,3)));
    X(:,4) = X(:,4)*(115600-63070)+63070;
    X(:,5) = X(:,5)*(1110-990)+990;
    X(:,6) = X(:,6)*(116-63.10)+63.10;
    X(:,7) = X(:,7)*(820-700)+700;
    X(:,8) = X(:,8)*(1680-1120)+1120;
    X(:,9) = X(:,9)*(12045-9855)+9855;
    % Borehole response
    Y = (2*pi*X(:,4).*(X(:,5)-X(:,7)))./(log(X(:,3)./X(:,2)).*(1+(2*X(:,8).*X(:,4))./(log(X(:,3)./X(:,2)).*X(:,2).^2.*X(:,9)) + X(:,4)./X(:,6)));
    % Append squares and pairwise products: 36 more columns, 45 in total
    col = size(X,2)+1;
    for i = 2:9
        for j = i:9
            X(:,col) = X(:,i).*X(:,j);
            col = col+1;
            %X=cat(2,X,X(:,i).*X(:,j));
        end
    end
    lm = regress(Y,X);
    residuals = abs(Y - X*lm);     % same values as abs(Y-(lm'*(X'))')
    rss = sum(residuals.^2);

I am using MATLAB 2008.
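For readers following the `Y=` line in the code, it may help to see the borehole function written out. The formula below is the standard form given on the linked page; the mapping to columns is read off the posted code: $r_w$ is X(:,2), $r$ is X(:,3), $T_u$ is X(:,4), $H_u$ is X(:,5), $T_l$ is X(:,6), $H_l$ is X(:,7), $L$ is X(:,8) and $K_w$ is X(:,9).

```latex
f(\mathbf{x}) =
  \frac{2\pi\, T_u \,(H_u - H_l)}
       {\ln(r/r_w)\left(1
          + \dfrac{2 L\, T_u}{\ln(r/r_w)\, r_w^{2}\, K_w}
          + \dfrac{T_u}{T_l}\right)}
```

Note the very different magnitudes of the inputs: $r_w$ lies in $[0.05, 0.15]$ while $T_u$ lies in $[63070, 115600]$, so squared and product columns span many orders of magnitude.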
 2014-03-30, 14:53 #2 potonono Jun 2005 USA, IL 193 Posts

Are the columns of X linearly dependent when n is over 252? That would cause regress to set as many coefficients to zero as it needs to in order to resolve the rank deficiency.
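One way to probe this question is to look at the numerical rank and condition number of the design matrix. The following is only a sketch: it uses plain uniform draws from `rand` over the borehole input ranges instead of `lhsdesign`, the names `lo`, `hi` and `P` are made up for the illustration, and `rank` applies its own tolerance, which need not match the one `regress` uses internally.

```matlab
% Sketch: check a full-quadratic design matrix for (near-)linear dependence.
% Assumption: uniform random draws stand in for the Latin hypercube sample.
n  = 300;                                    % a sample size above 252
lo = [0.05   100  63070  990  63.1 700 1120  9855];   % lower bounds of the 8 inputs
hi = [0.15 50000 115600 1110 116   820 1680 12045];   % upper bounds of the 8 inputs
P  = repmat(lo,n,1) + rand(n,8).*repmat(hi-lo,n,1);

% Intercept, linear terms, then squares and pairwise products (45 columns)
X = [ones(n,1), P];
for i = 1:8
    for j = i:8
        X(:,end+1) = P(:,i).*P(:,j);
    end
end

fprintf('columns: %d, numerical rank: %d\n', size(X,2), rank(X));
fprintf('condition number: %g\n', cond(X));

% Ratio of column norms: a large value signals badly mismatched scales
colNorms = sqrt(sum(X.^2,1));
fprintf('largest/smallest column norm: %g\n', max(colNorms)/min(colNorms));
```

A numerical rank below 45, or a condition number approaching 1/eps, would suggest the columns are dependent to working precision even though they are not exactly dependent in theory.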
 2014-03-30, 17:21 #3 henryzz Just call me Henry "David" Sep 2007 Liverpool (GMT/BST) 178016 Posts

That is what everything points to. Even when I give all the predictors a uniform distribution, I get this problem. I also get it when I use a function that produces LPTAU sampling (with the same range of sample sizes working). I would be surprised if both sampling schemes shared a property like this.
 2014-04-11, 15:10 #4 henryzz Just call me Henry "David" Sep 2007 Liverpool (GMT/BST) 27×47 Posts

I think I have got to the bottom of this. Because it is a polynomial model, I had introduced multicollinearity. https://www.google.co.uk/search?q=mu...PM6n8gP8qIDoBQ

For some reason that still didn't solve it completely: r (the radius of influence, X(:,3) in my code) still had high multicollinearity with its square and with its interactions with other variables. I have started fitting the logarithm of r instead, which has removed the problem. I am not certain whether that was the correct course of action, though.
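The post does not say exactly which remedy was applied (the search link is truncated), so the sketch below assumes the common one for polynomial models: centre and scale each predictor before forming squares and products, and work with log(r) in place of r. As assumptions for the illustration, inputs are drawn with `rand` instead of `lhsdesign`, log(r) is drawn uniformly on [log(100), log(50000)], and the names `lo`, `hi`, `P` and `Z` are hypothetical.

```matlab
% Sketch: standardise predictors before building quadratic terms.
% Assumption: column 2 holds log(r), drawn uniformly on the log scale.
n  = 300;
lo = [0.05 log(100)    63070  990  63.1 700 1120  9855];
hi = [0.15 log(50000) 115600 1110 116   820 1680 12045];
P  = repmat(lo,n,1) + rand(n,8).*repmat(hi-lo,n,1);

% Centre each column to mean 0 and scale to standard deviation 1
mu = mean(P,1);
sd = std(P,0,1);
Z  = (P - repmat(mu,n,1)) ./ repmat(sd,n,1);

% Intercept, linear terms, then squares and pairwise products of the
% standardised columns (45 columns, as in the original model)
X = [ones(n,1), Z];
for i = 1:8
    for j = i:8
        X(:,end+1) = Z(:,i).*Z(:,j);
    end
end

fprintf('numerical rank: %d of %d\n', rank(X), size(X,2));
fprintf('condition number: %g\n', cond(X));
```

Comparing the condition number printed here with the one for the raw, unscaled design gives a rough indication of how much of the trouble comes from scale and correlation between a variable and its own powers. Coefficients fitted this way refer to the standardised variables, so they need converting back before they are interpreted in the original units.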
