
 
 
Research Article Volume 5 Issue 5
     
 
	Response surface methodology and its applications in agricultural and food sciences
 Andre I Khuri  
    
 
   
    
    
  
    
    
   
      
      
        
        Regret for the inconvenience: we are taking measures to prevent fraudulent form submissions by extractors and page crawlers. Please type the correct Captcha word to see email ID.
        
        
 
 
 
          
     
    
    
    
    
    
        
        
       
     
   
 
    
    
  
Department of Statistics, University of Florida, USA
Correspondence: Andre I Khuri, Department of Statistics, University of Florida, USA
Received: September 15, 2016 | Published: April 11, 2017
Citation: Khuri AI. Response surface methodology and its applications in agricultural and food sciences. Biom Biostat Int J. 2017;5(5):155-163. DOI: 10.15406/bbij.2017.05.00141
 Download PDF
        
       
   
Abstract
  The purpose of this article is to  provide an overview of response surface methodology (RSM) which includes the  modeling of a response function, the corresponding choice of design, and the  determination of optimum conditions. In addition, the use of RSM in  agricultural and food sciences is highlighted by citing several examples taken  from a variety of applied journals.
 
Introduction
  Response surface methodology (RSM)  consists of a group of mathematical and statistical techniques concerned with
  
    - The selection and construction of an appropriate design that can  provide adequate and reliable information concerning a certain Response  variable, denoted by 
.
 
    - The determination of a suitable model  that best fits the data that can be generated from using the design chosen in  (1). Such a model gives an approximate functional relationship between the  response variable  
   and a set of Control variables believed (by the experimenter) to have  an effect on the response   
  . These variables are denoted by  
   .
 
    - The finding of optimal settings on the  control variables that produce maximum (or minimum) response values within a  certain region of interest.
 
  
  From the historical point of  view, the article by Box and Wilson1 is  considered to be the first for having laid the foundations for RSM. Some of the  early papers that also contributed to the development of RSM include those by Box  and Hunter2 and Box and Draper.3 Several review papers on RSM were subsequently  published starting with the article by Hill and Hunter4 that emphasized practical applications of RSM in the chemical and processing fields.  This was followed by more recent reviews by Myers et al.5–7 A review of RSM from a biometric viewpoint was  given by8 who emphasized biological  applications rather than applications in the physical and engineering sciences.  In addition, several books were written on the subject of RSM by Myers, Khuri  and Cornell, Box and Draper and Myers et al.9–12
  In the early development of RSM,  from 1951 up through the 1970s, the main focus of attention was on controlled  experiments that can, for the most part, be performed in a laboratory. This was  particularly suited in an industrial setting with possible applications in the  physical and engineering sciences. The review paper by Hill and Hunter4 made reference to several applications in the  chemical industry. It should be pointed out here that the seminal work of Box  and Wilson1 occurred at a major chemical  company, and Box himself was initially trained as a chemist in England.
  One of the basic characteristics  of a response surface investigation is its sequential nature whereby  experiments are performed in stages. Information acquired from one set of  experiments is used to plan the strategy for a follow-up set of experiments.  This sequential pattern of experimentation was suggested by Box and Youle.13 Such an approach works well in an industrial  setting since the response values in a given stage can be obtained in a  relatively short time. However, this approach may not be feasible in an  agricultural setting where experiments typically require long periods between  stages. Furthermore, it is quite common in an agricultural experiment that the  results of a single experiment may be the only ones available rather than  having a series of experiments obtained sequentially. Another main difference  between an industrial experiment and an agricultural one has to do with  factors' levels. In an industrial experiment, the levels of quantitative  factors can be accurately controlled and measured. This, however, may be difficult  to do in an agricultural experiment. Edmondson14 Outlined several other major distinctions between industrial and agricultural  experiments.
  Mead and Pike8 were the first  to bring attention to the role of RSM in agriculture and biometric research in  general. They also provided a survey of bio- logical and agricultural journals  that used response surface methods. However, they emphasized the use of  nonlinear models to describe the behavior of biological data. Such models are  usually referred to as Mechanistic. The traditional modeling scheme used  in the early work on RSM was based on the so-called Empirical modeling where low-degree  polynomials are used to fit the response data. These polynomials are chosen and  tested on the basis of the observed data, but are not selected on the basis of  information that pertain to certain chemical of physical laws, as is the case  with mechanistic models.
  While certain approaches used in  RSM may not be quite suitable in an agricultural setting, there is a lot to be  gained from using certain response surface techniques in an agricultural  experiment. The purpose of this chapter is to provide an expose of some basic  methods and models used in RSM. Several applications of RSM in agricultural and  food sciences will also be mentioned using examples taken from the  corresponding literature.
 
Response surface models
  Let
 be a response variable of interest, and let   
   denote control variables believed to have an effect on   
  . As was mentioned earlier, one of the objectives of RSM  is to establish a functional relationship between   
    and its  associated control variables. Such a relationship is, in general, unknown, but  can be approximated by a low-degree polynomial model of the form  
   , where    
     
   is a vector function  of p elements that consists of powers and cross products of powers of  
     up to a certain degree  denoted 
        
   is a vector of  p unknown constant coefficients referred to as parameters, and _ is a random experimental  error assumed to have a zero mean. If model (2.1) provides an adequate  representation of the response, then the quantity   
   represents the so-called Mean response, or the Expected Value of   
  , and is denoted by  
   .
  Two models are commonly used in  RSM; they are the first-degree model (d = 1),
    ;              (2.2)
    and the second-degree model (d =  2),
   ;           (2.3) 
  Models (2.2) and (2.3) are special  cases of model (2.1). The first model is usually used in the initial phase of  the experiment representing part of an exploratory process to assess the  important factors. The second model is subsequently developed after a series of  experiments have been performed resulting in the identification of the important  factors to be considered in the experiment. At this stage, the experimenter will  be ready to use the model in doing data analysis to determine significance of  the model's parameters, estimate the mean response, and to arrive at optimum  operating conditions on the control variables that result in a maximum or a  minimum response over a certain region of interest which we denote by  
   .
  In order to estimate the unknown  parameters in model (2.1), a series on   
    experiments are  performed in each of which the response y is measured for specified settings of  the control variables. The totality of these settings constitutes the so called  response surface design, or just design, which can be represented by a matrix, denoted  by   
  , of order    
  called the design matrix,
    ;(2.4) 
    Where   
   denotes the    
   design setting  of   
   . Each row of   
    represents a  point, referred to as a design point, in a k-dimensional   
    clidean space.  The response value obtained at the   
    setting of    
  , namely       
   
  , is denoted by  
   . From (2.1) we thus have
    ;
(2.5)
  Where   
     denotes the error term at the  
   experimental run. Model (2.5) can be expressed in  matrix form as 
    ;(2.6)
    Where    
  0,  
     is a matrix of  order  
    whose 
      row is 
    , and   
   . Note that the first column of   
    is the column  of ones,  
    .
  A common method for estimating  the parameter vector in (2.6) is the one based on the method of ordinary least  squares. This method requires that has a  zero mean and a variance-covariance matrix given by  
     [see, for example, Khuri and Cornell (Section 2.3)].10  In this case, the least-squares estimator of    
 , denoted by   
  , is given by
    ; (2.7)
    The variance-covariance matrix of   
    is then of the form
  
:        (2.8)
    An estimate   
  , of the mean response at  
   , is obtained by replacing   
    by  
   , that is,
 
 
 
  :       (2.9)
    The quantity   
   also gives the value of the predicted response,  
   , at the  
     design point 
 
   . In general, at any point, 
    , in the region  
    , the predicted response  
     is
   
 : (2.10)
    Using formula (2.8), the variance  of  
    is then given by [Khuri and Cornell (1996, Section  6.1) or Khuri (2009, Section 12.4)] 
    : (2.11)
    The expression on the right-hand  side of (2.11) is called the Prediction variance. This is an important quantity  since the quality of prediction depends on the size of this variance. Also, the  determination of optimum operating conditions on the control Variables requires  that the prediction variance be small over the region of interest,  
   , in order to arrive at reliable information about the  true optimum of the response over    
  . This is of course dependent on the assumption that  the postulated model in (2.1) does not suffer from Lack  of fit (LOF). For a study of LOF of a fitted response surface model,  see, for example, Khuri and Cornell (Section  2.6).10
 
Response surface designs
  The choice of the design matrix,
, is quite important in any response  surface investigation since the prediction variance depends on    
   as can be seen from formula (2.11). Some  common design properties are
  
    - Orthogonality
 
  
  A design   
    is said to be orthogonal if the matrix    
   is diagonal, where   
  is the model matrix in (2.6). In this  case, the elements of  
    will  be uncorrelated since the off-diagonal elements of    
   in (2.8) will be zero. If the error  vector   
  in (2.6) is assumed to be normally  distributed as  
   , then these elements will be also  stochastically independent. This makes it easier to test the significance of  the unknown parameters in the model.
  
    - Rotatability
 
  
  A design   
  is said to be rotatable if the  prediction variance in (2.11) is constant at all points that are equidistant  from the design center, which, by a proper coding of the control variables, can  be chosen to be the point at the origin of the    
  -dimensional coordinates system. It  follows that   
   is constant at all points that fall on  the surface of a hyper sphere centered at the origin, if the design is  rotatable. This causes the prediction variance to remain unchanged under any  rotation of the coordinate axes. In addition, if optimization of   
   is desired on concentric hyper spheres,  as in the application of ridge analysis, which will be discussed later, then it  would be desirable for the design to be rotatable. This makes it easier to  compare the values of  
    on a given hyper sphere as all such  values will have the same variance.
  The necessary and sufficient condition  for a design to be rotatable was given by Box and Hunter.2 See also Appendix 2B in15,16 introduced a measure that quantities the amount of rotatability in a response  surface design. This measure can be helpful in comparing designs on the basis  of rotatability, assessing the extent of departure from rotatability, and in  improving readability by a proper augmentation of a non rotatable design.
  
    - Optimal designs
 
  
  Optimal designs are those that are  constructed on the basis of a certain optimality criterion that pertains to the  'closeness' of the predicted response,   
  , to the mean response,  
    over a certain region of interest,   
   . The design criteria that address the minimization  of the variance associated with the estimation of model (2.1)'s unknown  parameters are called Variance-related criteria. The most prominent of  such criteria is the D-Optimality criterion which  maximizes the determinant of the matrix   
  . This amounts to the minimization of  the size of the confidence region on the vector  
   in model (2.6). Another  variance-related criterion that is closely related to D-optimality is the G-Optimality criterion which requires the minimization  of the maximum over   
    of  the prediction variance in (2.11).
  Such variance-related criteria are  often referred to as Alphabetic optimality. They are meaningful when  the fitted model in (2.1) represents the true relation- ship connecting   
    to its control variables. There are, however,  many situations Where this is not the case due to fitting the "wrong  model". This results in the So-called Model bias.  For example, a first-degree model may be fitted to a data set when in reality a  second-degree model would be a better representative of the response data. Box  and Draper3 placed a great emphasis on the  role of bias in the choice of a response surface design and advocated choosing  designs that minimized model bias. They considered the minimization of bias to  be a very important design criterion, and in certain cases, even more important  than the variance-related criteria.
  Designs for first-degree models
  Designs for fitting first-degree models  as in (2.2) are called first-order designs. The most common of such designs are  the   
  factorial (   
   is the number of control variables), and the Plackett-Burman design.
  The 2k Factor
ial Design: In a       factorial design, each control variable is  measured at two levels, which can be coded to take the values, -1, 1, that  correspond to the so-called low and high levels, respectively, of each  variable. This design consists of all possible combinations of such levels of  the   
   factors. Thus, each row of the design  matrix   
   in (2.4) consists of all 1's, all -1's,  or a combination thereof. It therefore represents a particular treatment  combination. In this case, the number,    
  , of experimental runs is equal to   
    provided that no single design point is  replicated more than once. For example, in an agricultural experiment, two  levels of fertilizer A are combined with two levels of fertilizer B in order to  study their effects on the yield of a certain vegetable crop over a certain  period of time. This results in a   
   factorial experiment with four  treatment combinations.
 
  If    
  is large   
  , the  
     design requires a large number of design  points. In this case, fractions of   
    can be considered. For example, we can  consider a one-half- fraction design which consists of one-half the number of points  of a   
    design, or a one-fourth-fraction design which  consists of one- fourth the number of points of a    
   design. In general, a   
    fraction of a  
     design consists of   
    points from a full   
    design. Here, m is a positive integer such  that    
   so that all the    
   parameters in model (2.2) can be estimated.  The construction of fractions of a   
    design is carried out in a particular manner,  a description of which can be found in several experimental design textbooks,  such as Box & Raktoe, et al.17–19 See  also Chapter 3 in Khuri and Cornell.15
  The plackett-burman design: The Plackett-Burman design allows two levels for each of  the 
    
      control variables, just like a    
  design, but requires a much smaller  number of experimental runs, especially if    
  is large. It is therefore more economical than  the    
   design. Its number,   
  , of design points is equal to  
    , which is the same as the number of  parameters in model (2.2). In this respect, the design is said to be saturated  because its number of design points is equal to the number of parameters to be  estimated in the model. Furthermore, this design is available only when n is a  multiple of 4. Therefore, it can be used when the number,   
   , of control variables is equal to 3,  7, 11, 15, ....
  To construct a Plackett-Burman design  in   
    variables, a row is first selected whose elements  are equal to -1 or 1 such that the number of 1's is   
   and the number of -1's is   
   . The next    
   rows are generated from the first row  by shifting it cyclically one place to the right  
    times. Then, a row of negative ones is  added at the bottom of the design. For example, for    
  , the design matrix,  
    , has 8 points whose coordinates are   
   , and is of the form 
  
  : 
  Design arrangements for   
   factors can be found in Plackett –Burman  (1946).
  Designs for second-degree models
  These are designs for fitting  second-degree models as in (2.3), which has  
    parameters (they are also referred to  as second-order designs). The number of distinct design points of such design  must therefore be at least equal to p. The design settings are sometimes coded  so that   
   and 
  
  ,   
   , where    
   is the number of experimental runs and   
    is the  
     setting of the   
  control variable   
   .
  The most frequently-used second-order  designs are the   
    factorial, central  com- posite, and the Box-Behnken designs.
  The   
Factorial Design: The       factorial  design consists of all the combinations of the levels of the   
   control variables which have three  levels each. If the levels are equally spaced, then they can be coded so that  they correspond to -1, 0, 1. For example, for   
   , the design matrix, in coded form,  consists of 9 points as shown below:
  
   
:
    The number of design points for a  
     design is    
  , which can be very large for a large   
   . The use of full factorial designs is  therefore not feasible if the number of experimental units is limited.  Fractions of a    
   design can be considered to reduce the cost of running such an  experiment. A general procedure for constructing fractions of    
  is described, for example, in.19 See also McLean and Anderson.20
The central composite design: The central composite design (CCD) is perhaps the most  popular of all second-order designs. It was first introduced in1 as an alternative to the      design. This  design consists of the following three parts:
    -  A complete (or a fraction of)   
 factorial design whose factors' levels are  coded as -1, 1. This is called the factorial portion of the design.
 
    -  An axial portion consisting of   
    points arranged so that two points are chosen  on the axis of each control variable at a distance of    
   from the design center (chosen as the point at  the origin of the coordinates system). We refer to    
   as the axial parameter.
 
    -  A certain number,  
    , of replications at the design center     
 . This is called the center-point portion.
 
  
   Thus, the total number of design points in a  CCD is   
   . For example, a CCD for 
      
    has the form
    : 
    We note that the CCD is obtained by  augmenting a first-order design, namely, the and then   
   center-point replications. This design is  usually developed in a manner consistent with the sequential nature of a  response surface investigation in starting with a first-   
    factorial with additional experimental runs, namely, the   
    axial points order  design, to fit a first-degree model, followed by the addition of design points  to _t the larger second-degree model. The first-order design serves in a  preliminary phase to get initial information about the response system and to  assess the importance of the factors in a given experiment. The additional  experimental runs are chosen for the purpose of estimating all the   
   parameters in model (2.3). The fitted  model is then used in the determination of optimum operating conditions on the  control variables over the region of experimentation.
  When    
   is large   
   , the factorial portion can be  replaced by a fraction of a     
  design. For example, for    
  , a one-half fraction of   
    can be used giving a total of 16 points in the  factorial portion instead of 32 (for more details about fractionating in the  factorial portion, see Khuri and Cornell, 1996, Section 4.5.3).
  A CCD can become rotatable by  assigning the value   
   to the axial parameter,   
   , where   
   denotes the number of points in its  factorial portion, that is,   
   . In addition, the number of  center-point replications,    
  , can be chosen so that a rotatable  CCD will have the additional property of orthogonality (see Section 3). Note  that orthogonality of a second-order design is attainable only after expressing  model (2.3) in terms of orthogonal polynomials as explained in.2 See also.15 In  particular, Table 4.3 in Khuri and Cornell's book can be used to determine the  value of    
  in order for a rotatable CCD to have  the additional orthogonality property.
  The box-behnken design: This design was  developed by Box GEP et al.21 It provides  three levels for each factor and consists of a particular subset of the  factorial combinations from the     factorial design. The actual  construction of such a design is described in the three RSM books,11 Khuri and Cornell (1996, Section 4.5.2), and.12 Some Box-Behnken designs are rotatable, but this  design is not always rotatable. Box GEP21 list  a number of design arrangements for   
    
    control variables.
  The san cristobal design: Rojas22 introduced a variant of the CCD,  called the San Cristobal Design (SCD), for sugar farming experiments. It is  utilized in situations where the levels of  
    
   control variables are restricted to be  nonnegative, as is the case with fertilizers experiments. The SCD consists of  
    factorial points combined with center  and axial points, all contained within the positive orthant. It also includes a  control where no fertilizers are applied. More recently, the performance of  this design was evaluated by Haines LM23 who reviewed some of its properties.
 
Determination of optimum conditions
  Optimization plays a key role in any  response surface investigation. One of the main objectives of modeling the  response is to use the fitted model in determining optimum conditions on the  model's control variables that result in a maximum (or minimum) response over a  certain region of interest,
. This, of course, assumes that the  model has been screened to determine its suitability for providing an adequate  representation of the mean response over the region    
 .
  Quite often, a second-degree model is  employed after a series of experiments have been sequentially carried out  leading up to a region that is believed to contain the location of the optimum  response. We shall therefore only mention optimization techniques that are  applicable to such a model.
  Optimization of a second-degree model
  Let us consider the second-degree  model in (2.3), which can be written as
    ;(4.1)
    where     
    
    and  
     is a symmetric  matrix of order    
   whose   
    diagonal element is   
   , and its   
   diagonal element is   
   . If   
    observations are obtained on       using a design  matrix   
  as in (2.4), then (4.1) can be written  in vector form as in (2.6), where the parameter vector    
   consists of   
   and the elements of   
    and   
   . Assuming that  
    and   
   , the least-squares estimate of    
   is  
     as  given in (2.7). The predicted response at a point   
    in the region   
    is then of the form    
 
   (4.2)
  where    
  and the elements of  
     and   
   are the least-squares estimates of     
  and the corresponding elements of  
   and   
   , respectively.
  An unconstrained optimum is obtained by  optimizing    
 
 unconditionally with respect to  
    . This is achieved by taking the  partial derivatives of    
  with respect to    
  , equating each one to zero and then  solving the resulting   
    equations. The  solution to these equations provides the coordinates of the so-called stationarypoint which we denote by   
   . This point may not necessarily be a  point of optimum. For a maximum at    
 , the matrix    
   must be negative  definite, or equivalently, if its eigenvalues are all negative. For a minimum    
 , must be positive definite, or Equivalently, if its  eigenvalues are all positive (if some of the eigenvalues are positive and some  are negative, then  
    is a saddle point). Of course, an optimum is only  meaningful if  
    falls within the region  
   . If the location of the optimum falls outside this  region, then it will be necessary to use the method of ridge analysis to  determine Optimum conditions over  
   . This is explained in the next section.  
  The method of ridge analysis: When the location  of the stationary point falls outside the region of interest,   
    
   , the next step is to determine optimum  operating conditions within the boundary of    
  . For this purpose we use the method of  ridge analysis, which was originally introduced  by Hoerl AE24 and later formalized by Draper  NR.25 This method optimizes    
   in (4.2) subject to    
   being on the surface of a hyper sphere of  radius  
     and centered at the origin, namely,
 
   
 (4.3)
    This constrained optimization is  conducted using several values of       corresponding to hyper spheres contained  within the region    
 . The rationale for doing this is to  get information about the optimum at various distances from the origin within    
 . Since this optimization is subject to  the equality constraint given by (4.3), the method of Lagrange  multipliers can be used to search for the optimum. Let us there- fore  consider the function
    (4.4)
  where   
  is a Lagrange multiplier. Differentiating    
    with respect to  
    and equating the derivative to zero, we  get 
    (4.5)
    Solving for  
   , we obtain
    (4.6)
    A  maximum (minimum) is achieved at this point if the matrix     
  of second- order partial derivatives of    
   with respect to       is negative  definite (positive definite). From (4.5), this matrix is given by
   
  To achieve a maximum, Hoerl AE24 suggested that   
   be chosen larger than the largest  eigenvalue of   
   . Such a choice causes   
    to  be negative definite. Choosing   
    smaller than the smallest eigenvalue of    
  causes  
     to be positive definite which results in a  minimum. Thus, by choosing several values of      in this fashion, we can, for each   
   find the location of the optimum (maximum or  minimum) by using formula (4.6) and hence obtain the value of     
 . The solution from (4.6) is feasible  provided that     corresponds to a hyper sphere that  falls entirely within the region      . The optimal value of  
    is computed by substituting x from  (4.6) into the right-hand side of (4.2). This process generates plots of   
    and   
    against        
     These plots are useful in determining, at any  given distance    
   from the origin, the value of the optimum as  well as its location. More details concerning this method can be found in Khuri AI,15 and.11
  Khuri AI26 provided a modification of the method of ridge analysis by placing an  added constraint on the size of the prediction variance associated with the  predicted response in (4.2) within the region   
   . The rationale for the additional  constraint stems from the fact that since optimization is based on using ^y(x),  which is a random variable, it would be necessary for the prediction variance  not to be highly variable within  
    . This modification can provide better  optimization results, especially when the design used to _t model (4.2) is not  rotatable.
  The results of  ridge analysis can be easily obtained using PROC RSREG in SAS Institute, Inc.27 and adding the "Ridge Max", or  "Ridge Min" statements, depending on whether it is desired to have a  maximum response or a minimum response, respectively, over the region   
   . It should be noted that regardless of  how the control variables are coded, SAS codes the variables so that the  boundary of   
    has a radius equal to 1 (assuming that    
   is spherical). The next numerical example gives  details of the SAS code needed to _t the model, get its parameter estimates,  determine the nature of the stationary point, and finally display the results  of ridge analysis.
 -  Numerical example
 
  A central  composite rotatable design with 6 center-point replications was set upto  investigate the effects of three fertilizer ingredients on the yield of snap  beans. The fertilizer ingredients and actual amounts applied were nitrogen (N),  from 0.94 to 6.29 lb/plot; phosphoric acid   
   , from 0.59 to 2.97 lb/plot; and potash    
   , from 0.60 to 4.22 lb/plot. The  response of interest,   
   , is the average yield in pounds per plot  of snap beans. The coded variables,    
  , are given by
    
    The design  settings (in coded form) and corresponding response values are given in Table 1:15 We note that the design is rotatable  since the axial parameter value is   
   , where   
   is the number of points in the  factorial portion of this CCD. The region  
 is therefore spherical with a radius = 1.682.
  In this example,  the predicted response is
    The matrix   
    [see formula (4.2)] is given by
   
 :
 
 
    
      
   | 
      
   | 
      
   | 
      N  | 
      P2O5   | 
      K2O  | 
      Yield  | 
    
    
      -1  | 
      -1  | 
      -1  | 
      2.03  | 
      1.07  | 
      1.35  | 
      11.28  | 
    
    
      1  | 
      -1  | 
      -1  | 
      5.21  | 
      1.07  | 
      1.35  | 
      8.44  | 
    
    
      -1  | 
      1  | 
      -1  | 
      2.03  | 
      2.49  | 
      1.35  | 
      13.19  | 
    
    
      1  | 
      1  | 
      -1  | 
      5.21  | 
      2.49  | 
      1.35  | 
      7.71  | 
    
    
      -1  | 
      -1  | 
      1  | 
      2.03  | 
      1.07  | 
      3.49  | 
      8.94  | 
    
    
      1  | 
      -1  | 
      1  | 
      5.21  | 
      1.07  | 
      3.49  | 
      10.9  | 
    
    
      -1  | 
      1  | 
      1  | 
      2.03  | 
      2.49  | 
      3.49  | 
      11.85  | 
    
    
      1  | 
      1  | 
      1  | 
      5.21  | 
      2.49  | 
      3.49  | 
      11.03  | 
    
    
      -1.682  | 
      0  | 
      0  | 
      0.94  | 
      1.78  | 
      2.42  | 
      8.26  | 
    
    
      1.682  | 
      0  | 
      0  | 
      6.29  | 
      1.78  | 
      2.42  | 
      7.87  | 
    
    
      0  | 
      -1.682  | 
      0  | 
      3.62  | 
      0.59  | 
      2.42  | 
      12.08  | 
    
    
      0  | 
      1.682  | 
      0  | 
      3.62  | 
      2.97  | 
      2.42  | 
      11.06  | 
    
    
      0  | 
      0  | 
      -1.682  | 
      3.62  | 
      1.78  | 
      0.6  | 
      7.98  | 
    
    
      0  | 
      0  | 
      1.682  | 
      3.62  | 
      1.78  | 
      4.22  | 
      10.43  | 
    
    
      0  | 
      0  | 
      0  | 
      3.62  | 
      1.78  | 
      2.42  | 
      10.14  | 
    
    
      0  | 
      0  | 
      0  | 
      3.62  | 
      1.78  | 
      2.42  | 
      10.22  | 
    
    
      0  | 
      0  | 
      0  | 
      3.62  | 
      1.78  | 
      2.42  | 
      10.53  | 
    
    
      0  | 
      0  | 
      0  | 
      3.62  | 
      1.78  | 
      2.42  | 
      9.5  | 
    
    
      0  | 
      0  | 
      0  | 
      3.62  | 
      1.78  | 
      2.42  | 
      11.53  | 
    
    
      0  | 
      0  | 
      0  | 
      3.62  | 
      1.78  | 
      2.42  | 
      11.02  | 
    
  
  Table 1 Design Settings and Yield Values
 
 
 
  The stationary point    
  
     corresponding  to   
  
     is located at (- .394, - .364, - 0.175) is a saddle  point since the eigenvalues of  
  
       are 1.841,  0.367, - 3.304 which have mixed signs. Thus    
  
     is neither  positive definite nor negative definite, that is, it is indefinite. To find the  maximum of   
  
     over  
  
     , it is necessary here to use the method of ridge analysis.
  The needed  SAS code to obtain the results of ridge analysis (using PROC RSREG) is given  below 
    DATA;
    INPUT    
  Y;
    CARDS;
    (enter here the data from Table 1)
    PROC SORT;
    BY   
  ;
    RUN:
    PROC RSREG;
    MODEL   
   /LACKFIT;
    RIDGE MAX;
    RUN;
  The MODEL statement in PROC RSREG  fits a second-degree model in the control variables,     
 . Note that the statements, "PROC SORT" and  "BY  
    ," are needed to perform a lack-of-_t test [see  Section 2.6 in Khuri and Cornell (1996)] on the second-degree model. The data  are sorted by the variables   
   so that the eplicated observations at the design  center are grouped together. The actual lack- of fit test is performed by  adding the option "LACKFIT" to the MODEL statement in PROC RSREG.  Using the data in Table 1, the resulting lack-of-fit F test statistic has the  value      with 5 and 5 degrees of freedom. The corresponding  p-value is 0.1333 which is not significant at the 0.10 level.
  The results of ridge analysis are  shown in Table 2. We note that that the maximum response  value, 12.886, is attained on the boundary of    
   (identified by  the coded radius 1 which corresponds to the radius   
   ) at the point   
     
    
     Expressed in  units of pounds per plot, the corresponding levels of the original factors are
      
     , and 
    
      Coded radius  | 
      Estimated    response   | 
      
   | 
      
   | 
      
   | 
    
    
      0  | 
      10.462  | 
      0  | 
      0  | 
      0  | 
    
    
      0.1  | 
      10.575  | 
      -0.106  | 
      0.102  | 
      0.081  | 
    
    
      0.2  | 
      10.693  | 
      -0.17  | 
      0.269  | 
      0.11  | 
    
    
      0.3  | 
      10.841  | 
      -0.221  | 
      0.438  | 
      0.118  | 
    
    
      0.4  | 
      11.024  | 
      -0.269  | 
      0.605  | 
      0.12  | 
    
    
      0.5  | 
      11.243  | 
      -0.316  | 
      0.771  | 
      0.117  | 
    
    
      0.6  | 
      11.499  | 
      -0.362  | 
      0.935  | 
      0.113  | 
    
    
      0.7  | 
      11.79  | 
      -0.408  | 
      1.099  | 
      0.108  | 
    
    
      0.8  | 
      12.119  | 
      -0.453  | 
      1.263  | 
      0.102  | 
    
    
      0.9  | 
      12.484  | 
      -0.499  | 
      1.426  | 
      0.096  | 
    
    
      1  | 
      12.886  | 
      -0.544  | 
      1.589  | 
      0.089  | 
    
  
  Table 2 Details of Ridge Analysis
 
 
 
 
Applications in agricultural and food sciences
  Mead R, et al.8 were among  the first authors to explore the use of RSM in biological research. They  examined a large number of papers in biological journals to determine the  extent of using RSM ideas. They reported that "not much awareness of  current RSM methods was shown." They proposed a "joint development by  the biologist and the statistician of particular biologically reasonable models  for particular practical research problems." This is a good advice since  the practical research worker will be more interested in methods that pertain  to his (her) particular field of application rather than pursuing general  results.
  Fortunately, RSM has since become  more applicable to a wide spectrum of applied research areas, including those  with biological and agricultural applications. The development of new  statistical software and the introduction of fast computers have made it a lot  easier for practitioners to attempt more advanced RSM technique than was  possible before. The food industry, in particular, has been a prime user of RSM  since the early 1970s.5 devoted two sections  to review various applications of RSM in the food and biological sciences. I  myself was involved in one such application in determining the optimum  combination of the levels of washing temperature, washing ratio of water volume  to sample weight, and washing time on the quality of minced mullet flesh.28
  In the remainder of this section,  several papers will be cited to highlight some applications of RSM. These  papers represent only a small sample since the actual number of papers with RSM  applications is very large.
  Edmondson RN14 provided an  interesting application of RSM to greenhouse experiments and presented some  valuable insights into the use of RSM in an agricultural setting versus an  industrial one, as was mentioned earlier. Schmidt, et  al.29 investigated the effects of  cysteine and calcium chloride on the textural and water-holding characteristics  of dialyzed whey protein concentrates gel systems. These characteristics were  measured by hardness
 , cohesiveness    
  , springiness   
   , and compressible water   
   . A central composite design with five center-point  replications was used and a second-degree model was fitted to each of the four  responses. This experiment involved four response variables and is therefore  labeled as a multi response experiment. This is an important and a relatively  recent area in RSM. It has attracted a lot of attention, especially in the  context of simultaneous optimization of the various responses considered in the  of operating conditions on the control variables that result in optimum, or  near optimum, values for all the responses. Khuri, et al.15 applied a multi response optimization technique introduced by30 to the simultaneous maximization of the four  responses, namely y1, y2, y3, y4, in29  experiment. A detailed review of multi response experiments can be found in  Khuri.31,10 See also Chapter 7 in Khuri, A. I, et al.15 Another example concerning a  multi response experiment was described in Evans RA, et al.32 who considered data of seed-germination  percentages after four weeks incubation of four plant species in response to 55  alternating and constant-temperature regimes in dark laboratory germinators. A  second-degree model was fitted to the data from each of the four species. This  could have been treated as a multi response experiment involving the responses  from the four species. Evans et al., however, chose to fit the models  individually to their respective data, which is not advisable since the  responses can be correlated and such individual fits ignore any  interrelationships that may exist among the responses. Instead, multi response  techniques should be used to fit the models in the multi response system [see, for example, Section 7.2 in].15
  Keisling TC et al.33 utilized  response surface techniques to predict weed age and future weed size from weed  height. The objectives of their study were to: (a) utilize response models to  generate data for describing weed interference in soybeans, (b) present  strategies for estimating multispecies interference, and (c) project yield loss  from existing data. Such a study was designed to produce information to assist  soybean producers in recognizing economically detrimental threshold levels of  weed infestations which require the initiation of control measures. Broudiscou et al34 investigated the effects  of several mineral compounds on feed degradation and microbial growth in a  continuous culture system using RSM. The models considered were of the second  degree fitted to data generated by a nonstandard design that consisted of 16  points giving seven levels to each of the four factors in the experiment. The  design had good characteristics (by comparison to a CCD with 25 experimental  runs, as shown in their Table II on page 257), was close to being orthogonal,  and almost rotatable. The authors used Khuri AI16 measure of rotatability to  assess the percent rotatability of their design, which turned out to be 99.6 as  compared to 89.2 for the CCD. Furthermore, the design was also more G-efficient.
   RSM has received attention for modeling the  performance of agronomic experiments. For example,35  used inverse polynomials to model the yield of maize against three control  variables, namely levels of nitrogen, phosphorous, and potassium. A 33  factorial design was used and the experiment was conducted in a randomized  complete block layout with two replications per treatment combination. The  inverse polynomial model36 provided a better  fit than the traditional second-degree model. The latter model may produce  negative estimates of the yield response, which, of course, must be positive.  This shows that taking into account any physical knowledge known about the  response can be very beneficial when choosing an appropriate model.
  Food science has also benefited  from the application of RSM to its various areas of research. Diniz FM et al.37 used RSM to study the effects of pH, temperature,  and enzyme-substrate ratio (E/S) on the degree of hydrolysis of dogfish muscle  protein. The effect of the hydrolysis variables was described using a  Box-Behnken design (see Section 3.2.3). This design was also utilized by Cao W  et al.38 to investigate the effects of    
   = ultrasonic temperature  (30 70 oC),    
   = power (120 300 W), and   
    = time (10 50 min) on ultrasonic assisted extraction for  oligosaccharides from longan fruit pericarp (OLFP). Their fitted second-degree  model was then used to obtain optimum conditions on      that maximize the OLFP response. Optimization was also the goal of  a study conducted by Jiang G et al.39 to study the effects of temperature, pH, and  enzyme concentration/substrate concentration (E/S) ratio on the response,  degree of hydrolysis (DH) for a marine shrimp called acetes chinensis that was  harvested in China. The design used was a CCD for three control variables with    
   center-point replications  and an axial parameter  
   . This causes the design to be rotatable. Also, since  
   , the design has the additional uniform precision property.15 By definition, a rotatable design has the  uniform precision property if the prediction variance at the design center is  equal to the prediction variance at a distance of one (in the coded space) from  the center. Such a property for a rotatable design maintains approximate  uniform distribution of precision (in estimating the response) in the vicinity  of the design center.15 The results of Cao et  al.'s study indicated that hydrolysis of shrimp (acetes chinensis) resulted in a  maximum DH value of about 26.33 % under the optimal conditions on temperature,  pH, and E/S ratio. 
  Another optimization experiment  was carried out by Zhang, et al.40 in a study concerning pyriodoxine (PN), which is one of the three members of  the vitamin B6 group. It has broad applications in the food industry,  cosmetics, and medical supplies. RSM was successfully applied to determine  optimum operating conditions for maximum conversion of PN. The control  variables were reaction temperature, re- action time, enzyme loading, molar  substrate ratio, and water activity. The response was the conversion of PN. The  design used was a CCD whose factorial portion consisted of a one-half fraction  of a 25 factorial, its axial portion  contained 10 points with an axial parameter    
  and    
   center-point replications.  This design is rotatable since   
   where   
   is the number of factorial points. It also has the uniform  precision property since  
     (Table  4.3 in).15 A listing of several  applications of RSM in the optimization of chemical and biochemical processes  was given by.41 In addition to their review  of the recent literature on RSM applications to the aforementioned areas, they  also provided a critique concerning the misuses of RSM in some of the reviewed  articles.42–44
 
Acknowledgments
 Conflicts of interest
  
References
  
    - Box GEP, Wilson KB.  On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society. 1951;B 13:1–45.
 
- Box GEP, Hunter JS. Multifactor experimental  designs for ex-ploring response surfaces. Ann  Math Statist. 1957;28(1):195–241.
 
- Box GEP, Draper NR. A basis for  the selection of a response surface design.  J Amer Stat Assoc. 1959;54:622–654.
 
- Hill WJ, Hunter WG. A review of  response surface methodology: A literature survey. Technometrics. 1966;8(4):571–590.
 
- Myers RH,  Khuri AI, Carter WH. Response surface methodology: 1966-1988. 1989;31:137–157.
 
- Myers RH. Response surface  methodology - Current status and future directions. J Qual Technology. 1999;31:30–44.
 
- Myers RH, Montgomery DC, Vining  GG, et al. Response surface methodology: A retrospective and literature survey. J Qual Technology. 2004;36:53–77.
 
- Mead, R, Pike DJ.A review of response surface  methodology from a biometric viewpoint. Biometrics.  1975;31:803–851.
 
- Myers RH. Response Surface  Methodology. Allyn and Bacon, Boston; 1971.
 
- Khuri, AI.  Multiresponse surface methodology. In: Handbook of Statistics, Vol. 13, Subir  Ghosh, CR Rao, editors. Elsevier Science B. V, Amsterdam; 1996. p. 377–406.
 
- Box GEP, Draper NR. Response  Surfaces, Mixtures, and Ridge Analyses, 2nd ed. 2007.
 
- Myers RH,  Montgomery DC, Anderson-Cook CM. Response Surface Methodology. Wiley, Hoboken,  New Jersey; 2009.
 
- Box GEP, Youle PV. The  exploration and exploitation of response surfaces: An example in the link between  the fitted surface and the basic mechanism of the system Biometrics.1955;11(3):287–323.
 
- Edmondson RN. Agricultural  response surface experiments based on four-level factorial designs. Biometrics. 1991;47(4):1435–1448.
 
- Khuri AI,  Cornell JA. Response Surfaces, 2nd ed. Dekker, New York, USA; 1996.
 
- Khuri AI.  A measure of rotatability for response surface designs, Techno-metrics.1988;30:95–104.
 
- Box GEP, Hunter WG, Hunter JS.  Statistics for Experimenters; 1978.
 
- Ratkoe  BL, Hedayat A, Federer WT. Factorial Designs. Wiley, New York, USA; 1981.
 
- Montgomery  DC. Design and Analysis of Experiments. 6th ed. Wiley, New York, USA; 2005.
 
- McLean  RA, Anderson VL. Applied Factorial and Fractional De-signs. Dekker, New York;  1984.
 
- Box GEP, Behnken DW. Some new  three-level designs for the study of quantitative variables. Technometrics. 1960;2(4):455–475.
 
- Rojas BA.  The San Cristobal design for fertilizer experiments, Proceedings of the 11th  Congress of International Society for Sugar Cane Technology. 1962. p. 197–203.
 
- Haines LM.  Evaluating the performance of non-standard designs: The San Cristobal design.  In: Response Surface Methodology and Related Topics. In: Andre I Khuri, editor.  World Scientific, Singapore; 2006. p. 251–281.
 
- Hoerl AE. Optimum solutions of  many variable equations. Chem Eng Prog.  1953;55:69–78.
 
- Draper NR. Ridge analysis of  response surfaces. Technometrics. 1963;5(4):469–479.
 
- Khuri AI  and Myers RH. Modified ridge analysis. Technometrics.  1979;21:467–473.
 
- SAS  Institute, Inc. SAS Online Doc, Version 9.1.3. Author, Cary, North Carolina; 2008.
 
- Tseo, CL,  Deng JC, Cornell JA, et al. Effect of washing treatment on quality of minced  mullet flesh. J Food Sci. 1983;48:163–167.
 
- Schmidt  RH, Illingworth BL, Deng JC, et al. Multiple regression and response surface  analysis of the effects of calcium chloride and cysteine on heat-induced whey  protein gelation. J Agric Food Chem. 1979;27:529–532.
 
- Khuri AI,  Conlon M. Simultaneous optimization for multiple responses represented by  polynomial regression functions. Technometrics. 1981;23:363–375.
 
- Khuri AI.  The analysis of multiresponse experiments: A review. In: Statistical Design and  Analysis of Industrial Experiments. Subir Ghosh, editor. Dekker, New York, USA;  1990. p. 231–246.
 
- Evans RA, Easi DA, Book DN, et  al. Quadratic response surface analysis of seed-germination trials. Weed Science. 1982;30(4):411–416.
 
- Keisling TC, Oliver LR, Crowley  RH, et al. Potential use of response surface analyses for weed management in  soybeans (Glycine max), Weed Sci. 1984;32:552–557.
 
- Broudiscou  LP, Papon Y, Broudiscou AF. Effects of minerals on feed degradation and protein  synthesis by rumen micro-organisms in a dual e uent fermenter. Reprod Nutr Dev. 1999;39:255–268.
 
- Salawu  IS, Adeyemi RA, Aremu TA. Modified inverse polynomial and ordinary polynomial  as a response surface model: A case study of nitrogen, phosphate and potassium  levels on the yield of maize. Inter J  Pure Appld Sciences. 2007;1:18–24.
 
- Nelder JA. Inverse polynomials, a  useful group of multi-factor response function. Biometrics.1962;22:128–141.
 
- Diniz FM,  Martin AM. Use of response surface methodology to describe the combined effects  of pH, temperature, and E/S ratio on the hydrolysis of dog sh (Squalus acanthias) muscle. Inter J Food Sci Tech. 2003;31:419–426.
 
- Cao W,  Zhang C, Hong P, et al. Optimization of the production of shrimp waste protein  hydrolysate using microbial proteases adopting response surface methodology. Inter J Food Sci Tech. 2009;44:1602–1608.
 
- Jiang G, Jiang Y, Yang B, et al. Structural  characteristics and antioxidant activities of oligosaccharides from longan  fruit pericarp. J Aqric Food Chem. 2009;57(19):9293–9298.
 
- Zhang DH,  Bai S, Dong XY, Sun Y. Optimization of lipase-catalyzed regioselective  acylation of pyridoxine (vitamin B6). J  Agric Food Chem. 2007;55:4526–4531.
 
- Bas D, Boyaci IH. Modeling and  optimization I: Usability of response surface methodology. Journal of Food Engineering. 2007;78(3):836–845.
 
- Box GEP, Draper NR. The choice of  a second-order rotatable design. Biometrika.1965;50(1–2):335–352.
 
- Khuri, AI.  Linear Model Methodology. Chapman & Hall/CRC, Boca Raton, Florida; 2009.
 
- Plackett  RL, Burman JP. The design of optimum multifactorial experiments. Biometrika. 1946;33:305–325.
 
  
 
 
  
  ©2017 Khuri. This is an open access article distributed under the terms of the, 
 which 
permits unrestricted use, distribution, and build upon your work non-commercially.