A Simple Example
Let’s take a simple example of a two-factor design. The factors are X1 and X2 and the response is Y.
You decide to fit a response surface model using a 10-run face-centered central composite design. This design has two center points (rows 5 and 7).
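For readers who want to reproduce the layout, here is a minimal sketch of how such a design could be generated. It assumes the factors are coded to the range -1 to +1, which the post does not state, and the function name `face_centered_ccd` is purely illustrative. The runs come out in standard order, not in the run order of the table above, where the center points are rows 5 and 7.

```python
from itertools import product

def face_centered_ccd(n_center=2):
    """Runs of a two-factor face-centered CCD in coded units (assumed -1..+1)."""
    # Four factorial (corner) points.
    factorial = [(float(a), float(b)) for a, b in product((-1, 1), repeat=2)]
    # Four axial points; "face-centered" means they sit on the faces (alpha = 1).
    axial = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
    # Replicated center points.
    center = [(0.0, 0.0)] * n_center
    return factorial + axial + center

design = face_centered_ccd()
for run, (x1, x2) in enumerate(design, start=1):
    print(f"run {run:2d}: X1 = {x1:+.0f}, X2 = {x2:+.0f}")
```

Four corners, four face centers, and two center points give the ten runs, and none of them lies strictly between the center and the boundary.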
Here is a plot of your design points:
Note that all of your design points are located on the boundary or at the center of the design region. This is a great design if your response surface is, in fact, quadratic.
But what if the true response surface is not quadratic? What if this is the true response surface:
If this is the true response surface, you have selected very good design points for fitting the wrong model!
Your Fitted Model
Let’s assume that there is no noise in the data, so that the response values at your ten design points are the actual values given by the response surface. This figure shows the design points and their values on the true response surface.
Given the location of the design points – all on the boundary or at the center of the design region – the resulting fitted quadratic response surface looks like this:
“That’s not quadratic,” you say. But it is! The linear terms dominate the fitted model. Here is the equation for the model:
Y = 40 - 12.072*X1 + 13.333*X2 + 0.300*X1*X2 - 0.300*X1*X1 - 0.500*X2*X2
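One way to check that the linear terms dominate is to compare the size of the linear and second-order parts of this equation over the design region, and to locate the stationary point of the fitted quadratic. The sketch below does both. It assumes the factors are coded to [-1, 1], which the post does not state; the coefficients are copied from the equation above, and the helper names are illustrative.

```python
# Coefficients copied from the fitted equation above.
B0, B1, B2 = 40.0, -12.072, 13.333
B12, B11, B22 = 0.300, -0.300, -0.500

def linear_part(x1, x2):
    return B1 * x1 + B2 * x2

def second_order_part(x1, x2):
    return B12 * x1 * x2 + B11 * x1 * x1 + B22 * x2 * x2

def fitted(x1, x2):
    return B0 + linear_part(x1, x2) + second_order_part(x1, x2)

def stationary_point():
    # Setting the gradient to zero gives a 2x2 linear system:
    #   2*B11*x1 +   B12*x2 = -B1
    #     B12*x1 + 2*B22*x2 = -B2
    # solved here with Cramer's rule.
    a, b, c, d = 2 * B11, B12, B12, 2 * B22
    r1, r2 = -B1, -B2
    det = a * d - b * c
    return (r1 * d - b * r2) / det, (a * r2 - c * r1) / det

# ASSUMPTION: the design region is the coded square [-1, 1] x [-1, 1].
grid = [i / 10 for i in range(-10, 11)]
max_linear = max(abs(linear_part(a, b)) for a in grid for b in grid)
max_second = max(abs(second_order_part(a, b)) for a in grid for b in grid)
sx1, sx2 = stationary_point()

print(f"largest linear contribution:       {max_linear:.3f}")
print(f"largest second-order contribution: {max_second:.3f}")
print(f"stationary point: X1 = {sx1:.2f}, X2 = {sx2:.2f}")
```

Under that coding assumption, the linear part can move the prediction by about 25 units while the second-order part contributes at most about 1. The stationary point lies at roughly (-15.8, 8.6), far outside the region, so within the region the fitted surface looks almost like a tilted plane.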
Let’s compare the fitted model to the true response function. The true response function is shown in green and the fitted model in blue.
If you are trying to find settings to match a target, or to maximize or minimize your response, the conclusions that you draw from your fitted model will be erroneous, and likely seriously so.
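The failure described in this section can be reproduced with a small experiment. The sketch below is only an illustration: `true_response` is a HYPOTHETICAL surface (a plane plus a localized interior bump), not the surface shown in the post, and the factors are assumed to be coded to [-1, 1]. It fits a full quadratic model by ordinary least squares to noise-free values at the ten face-centered CCD points, then compares the fit with the truth at an interior location.

```python
import math
from itertools import product

def true_response(x1, x2):
    # HYPOTHETICAL surface, not the one from the post: a plane plus a
    # localized bump centered at (0.5, -0.5), inside the design region.
    bump = 15.0 * math.exp(-8.0 * ((x1 - 0.5) ** 2 + (x2 + 0.5) ** 2))
    return 40.0 - 10.0 * x1 + 10.0 * x2 + bump

def quad_terms(x1, x2):
    # Full quadratic model: intercept, linear, interaction, squared terms.
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_quadratic(points, y):
    # Ordinary least squares via the normal equations (X'X) b = X'y.
    rows = [quad_terms(*p) for p in points]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coef, x1, x2):
    return sum(c * t for c, t in zip(coef, quad_terms(x1, x2)))

# Ten-run face-centered CCD in coded units (corners, face centers, 2 centers).
ccd = [(float(a), float(b)) for a, b in product((-1, 1), repeat=2)]
ccd += [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0), (0.0, 0.0), (0.0, 0.0)]
y = [true_response(*p) for p in ccd]  # noise-free responses
coef = fit_quadratic(ccd, y)

err_at_design = max(abs(predict(coef, *p) - yi) for p, yi in zip(ccd, y))
err_at_bump = abs(predict(coef, 0.5, -0.5) - true_response(0.5, -0.5))
print(f"largest error at the design points: {err_at_design:.3f}")
print(f"error at the interior point (0.5, -0.5): {err_at_bump:.3f}")
```

With this made-up surface, the quadratic fit agrees closely with the data at every design point yet misses the interior feature almost entirely, because no run was placed where the feature lives.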
“How can I do better?”
- Don’t assume that you know the shape of the response function. More and more, we find that response functions are complex. Default to the assumption that it has a complex shape.
- Fit a flexible model that can capture that complex shape. We find that neural nets provide flexible fits. However, using neural nets with small data sets requires an innovative methodology called SVEM (Self-Validating Ensemble Modeling).
- Use design settings that cover the space essentially uniformly. We call such a design a space-filling design.
In our example, we used ten design points. Here is a space-filling design consisting of ten design points:
Note that this design gives much better coverage of the design space and supports the fitting of flexible models that reflect the nuances of the true response surface.
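The post does not say which space-filling construction produced this design. As one common option, the sketch below builds a ten-run Latin hypercube and, among a few hundred random candidates, keeps the one whose closest pair of points is farthest apart (a simple maximin criterion). The coded range [-1, 1], the candidate count, the seed, and the function names are all assumptions made for illustration.

```python
import random

def latin_hypercube(n_runs, n_factors, rng):
    """One random Latin hypercube on [-1, 1]^n_factors."""
    columns = []
    for _ in range(n_factors):
        # Split the range into n_runs equal bins, place one point at a random
        # position inside each bin, and shuffle the bin order per factor.
        bins = list(range(n_runs))
        rng.shuffle(bins)
        columns.append([-1.0 + 2.0 * (b + rng.random()) / n_runs for b in bins])
    return list(zip(*columns))

def min_distance(points):
    # Smallest Euclidean distance between any two runs in the design.
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )

def maximin_lhs(n_runs=10, n_factors=2, n_candidates=200, seed=1):
    # Keep the candidate whose closest pair of runs is farthest apart.
    rng = random.Random(seed)
    candidates = [latin_hypercube(n_runs, n_factors, rng)
                  for _ in range(n_candidates)]
    return max(candidates, key=min_distance)

design = maximin_lhs()
for run, (x1, x2) in enumerate(design, start=1):
    print(f"run {run:2d}: X1 = {x1:+.3f}, X2 = {x2:+.3f}")
```

Unlike the central composite design, every factor is sampled at ten distinct levels, and most runs fall in the interior of the region.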
The figure below shows the SVEM fit (blue) and the true response surface (green). Note that the SVEM fit is very close to the true response surface, especially in the interior of the design region.
“But how many design points?”
So, back to the original question: how many design points are needed? The answer depends on the complexity of the true response surface and the type of model-fitting algorithm that you use.
We advocate the use of space-filling designs with the SVEM modeling technique. In future posts, we will talk more about SVEM and the recommended number of design points, but generally, SVEM combined with space-filling designs requires significantly fewer runs than do classical approaches.