Using Design of Experiment Methods for Screening and Optimization of Quantitative Factors when Results are Qualitative
Disclaimer Ver 20200525-01
THE PROBLEM
Traditional design of experiment (DOE[1]) has the sequential stages of screening, optimization and modeling by response surface mapping (RSM)[2], [3]. This is most effective when factors (inputs) and responses are continuous and have objectively quantified[4] relationships. Subjectively quantifying the response of a treatment (trial) by ranking qualitative results on a numerical scale is problematic because it does not give an objective mathematical relationship for modeling. Mathematical analysis of the results and their relationship to the input factor levels allows progression through the stages to building a model of the optimized system. This traditional methodology is best suited to continuous, quantifiable independent variables like time, temperature, pressure and mixture amounts[5]. Models are possible for categorical factors (machine, operator) to find the best one under study, but no adjustments are possible between the elements of a categorical factor, which makes them less well suited to mathematical optimization. Unless a theoretical model of the system is known, the traditional approach is the method of choice. However, there are circumstances where generating a quantitative result for an experiment is difficult, costly or time consuming, if not impossible. A second issue is mathematically balancing the results of different responses to give a single value indicating the best trial (desirability function[6]). Whether used for quantitative or ranked qualitative responses, it requires a weighting of the relative importance of the responses, and that relative importance is often a qualitative judgment. Avoiding qualitative results is therefore not always possible, even for designs with quantitative factors and quantitative response results. The traditional approach also has to deal with qualitative issues.
THE OBJECTIVE
The objective is to detail a process to screen and optimize factors for factorial and mixture designs with the goal of reaching a robust design area when any of the results are qualitative and do not lend themselves to mathematical analysis. While intended for qualitative results, it is applicable to quantitative results.
METHOD
Overview
The titled process uses various elements of traditional DOE (vertex generation and optimality selection) to generate a small set of trial experiments from the number of factors. During the sequential process of selecting the best result from the set and using it as a starting point to generate the next set, the qualitative results will move toward an area of better results and the input factors become optimized. The procedure will screen out unnecessary factors and lead to a simpler design with optimized input factors. At the end of the process in the optimized response area, multiple trials can show similar indistinguishable results, implying a more robust design area.
Because results are qualitative, the experimenter’s experience and opinion determine what is “best”. Throughout this paper, “best” is the experimenter’s choice for any single response or balance of multiple responses.
General Process
Start by choosing experimental factors for study. Then set each of them to their individual high and low levels. The criterion for determining the levels for each factor is that they should be just far enough apart that any difference in results is likely real (minimal factor spanning, MFS). Generate the first set of vertex trials for the experimental design using a linear selection model. From these vertices, choose a subset by starting with one more trial than the number of factors and repeatedly adding one trial while checking the optimality criterion. Stop when the optimality reaches a maximum or shows only small gains for each additional trial. This is the number of trials to test in each set. Use this number of trials to generate the first trial set and its factor levels. Use the trial with the best result from this first set as a starting point to generate a second trial set for evaluation. Repeat the procedure until there are no significant improvements in results. At that point, the factor levels are optimal.
During this repeated process for mixtures, some of the factors may tend to move toward a 0 level. This indicates they are not adding any performance to the results. Remove these from the experimental design. This effectively screens them out. For a factorial design, the levels of a factor may not trend in any direction or may bounce between the same levels. This indicates that either the initial level range bracketed the optimum level or the factor has no effect. Checking the influence of these factors will determine whether they were initially optimal or have no important effect on the design.
Implicit in this procedure is the assumption that any interactions at the initial starting level will be similar to those at the optimized level. If this is not true, if the importance of the work warrants it, or if there is a concern that this is a local optimum, re-evaluating the MFS levels and, if desired, restarting from an off-optimum trial point will replicate and confirm the optimum found.
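The general process can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure as actually run: the function names, the starting blend, the half MFS ranges and the target blend are all hypothetical, every vertex of each set is "tested" instead of an optimality-selected subset, and a simulated distance to a known target stands in for the experimenter's judgment of which trial is best.

```python
from itertools import product

def extreme_vertices(lows, highs, total=100):
    """Vertices of the region lows <= x <= highs with sum(x) == total.

    Each vertex has all components but one at a bound; the remaining
    component is fixed by the mixture (sum) constraint.
    """
    n = len(lows)
    found = set()
    for free in range(n):
        others = [i for i in range(n) if i != free]
        for bounds in product(*[(lows[i], highs[i]) for i in others]):
            rest = total - sum(bounds)
            if lows[free] <= rest <= highs[free]:
                point = [0] * n
                for i, b in zip(others, bounds):
                    point[i] = b
                point[free] = rest
                found.add(tuple(point))
    return sorted(found)

def simulated_score(trial, target, half_mfs):
    """Stand-in for the experimenter's judgment: lower is better."""
    return sum(abs(t - g) / h for t, g, h in zip(trial, target, half_mfs))

def optimize(start, half_mfs, target, max_sets=100):
    """Repeat trial sets around the current best until nothing improves."""
    best = tuple(start)
    best_score = simulated_score(best, target, half_mfs)
    path = [best]
    for _ in range(max_sets):
        # Low and high levels around the current best; negatives become 0.
        lows = [max(0, b - h) for b, h in zip(best, half_mfs)]
        highs = [b + h for b, h in zip(best, half_mfs)]
        trials = extreme_vertices(lows, highs)
        if not trials:
            break
        candidate = min(trials, key=lambda t: simulated_score(t, target, half_mfs))
        score = simulated_score(candidate, target, half_mfs)
        if score >= best_score:      # no improvement: stop
            break
        best, best_score = candidate, score
        path.append(best)
    return best, path

# Hypothetical 3-component blend in percent; component 2 is not needed.
best, path = optimize(start=(30, 20, 50), half_mfs=(2, 2, 2), target=(40, 0, 60))
print(best, "reached after", len(path) - 1, "trial sets")
```

In this toy run the second component drifts to 0%, which is the screening behavior described above for mixtures.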
Benefits
Since the levels are set so the effects are linear, a linear selection model for the trials is used. The selection of a subset of possible trials based on optimality criteria reduces the number of trials in a full or partial design (fractional factorial[7], factorial[8], constrained simplex[9] or simplex[10]) that would be necessary to model higher order designs. Using optimality criteria will also select ones that are likely to find significant improvements with the minimum of effort.
The selections of the initial factors levels do not have to bracket the likely unknown optimum levels. They will progress toward it. In a traditional design that has not had factors screened and optimized, the levels chosen for the RSM could miss the optimum requiring a repeat of the design.
In a traditional design, screening and optimization are separate processes. For this process, screening and optimization occur together.
Because it involves varying all the factors at once, synergistic or antagonistic interactions are included during optimization. The final factor levels are also more robust.
When sequentially testing trials in an area where there are catastrophic changes, the process will move along the boundary if that is where the best results are located. It does not have the problem of missing areas near the boundary or failing to find an optimal area because of the difficulty in finding terms to model the catastrophic change.
As testing progresses, modification of the design space is possible. As the factor levels move toward an area of better results, those factors that have no benefit will go toward zero or do not change outside the initial levels. Removing them or testing them for their effect will reduce the number of additional trials needed for further optimization. The method also allows replacing the removed factors with new ones. Enlarging the factor space at any point will make it possible to test additional new factors.
The level range between low and high levels can be adapted if results are varying too widely or not enough.
When the best result is not clear-cut, the alternate trials of interest become starting points for additional trial sets in another direction, which could lead to a different local optimum.
Approaching the optimum, the results of some of the trials may become indistinguishable from each other. Making the test conditions harsher and retesting those trials makes the results more distinguishable. This allows progress to continue toward an optimum. Modifying the test conditions would add complexity to generating an RSM in a traditional approach, as all of the trials would require retesting instead of just those that have similar results.
A consequence of using MFS and a linear selection model between the chosen factor levels is no need for trials to deal with interactions and higher order effects. A result is that as the number of factors in a design becomes greater, the number of trials will not become excessively large, as would be the case with traditional RSM, which would need significantly more trials to model some or all of the interactions and higher order combinations.
If the traditional approach has a significant issue like an inability to determine a model or operational failure, the whole design may need abandoning and restarting. This approach is more forgiving as every set is essentially a restart.
One of the biggest benefits is that there is a defined end. When results do not improve, the factors being tested will give a result that is either acceptable for use or, with a high confidence, will never be able to achieve the desired result.
Detriments
The sequential trial sets can lead to some trials being in an already tested factor space. This results in some duplication. If trials are burdensome (cost, time, complexity, equipment limited) to the overall project, then any new points could be tested to see if they are in the convex hull[11] of all the previous trials and, if so, removed from the new trial selections. A positive perspective is to consider such a point a validation point for previously tested factor spaces. It would help avoid going in a poor direction due to an erroneous false positive or false negative result.
If the results are sensitive to small changes in the factors, it may require many trials to get to an optimum area.
This process will be less effective than the best case (optimum levels bracketed, all factors affect the response results, no catastrophic change in results between factor levels, limited interaction and higher order effects) traditional approach.
DISCUSSION
Mixture design[12] factors in a simplex[13] space are the basis for this discussion. A generalized version of the method discussion is applicable to continuous factorial designs in a Cartesian space or mixed designs[14], [15].
Traditional Approach
The factors in a mixture design are the components. Mixture designs have the components under study normalized and constrained to add up to 100%. Components not under study remain at consistent levels. This is expressed as % component 1 + % component 2 + % component 3 + … = 100%. This requirement reduces the degrees of freedom by 1. The consequence is that for a 3-component blend, only 2 of the components are varied independently. The third component is set to the level that makes the sum 100%. This reduces 3 dimensions (3D) to 2 dimensions (2D). Figure 1 shows a 3D Cartesian x, y, z graph. The mixture-constrained area is the 2D gray triangle. Any formulation has to be on the gray triangular plane, where the coordinates (1, 0, 0) represent the percents (100%, 0%, 0%). This is the basis for triangular, ternary or simplex graphs. The x, y, z labeled axes are appropriate for continuous factors, but only the 100% level of each component (1, 2, or 3) will intersect the x, y or z axis respectively.
The traditional approach selects a lower and higher level for each component far enough apart to cover the area of interest and hopefully the optimum, chooses a selection model (linear, quadratic, cubic) so that any curvature can be modeled, generates the trials, selects the most efficient ones to run and mathematically models the results. The results and model would cover the whole factor space. The lower and higher levels are each a single level constraint.
The factor space in Figure 2 is the area inside the colored (red, green and blue) edged convex hull polygon in simplex space labeled A through F. Table 1 lists the coordinates of the simplex space.
In the ternary or simplex Figure 2, component 1 had levels selected at 20 and 60%. These are the red lines between EF and AB. Similarly, component 2 levels are the green lines (10 and 50%) and component 3 levels are the blue lines (10 and 60%). The levels for the components are the constraints. The points A through F are the vertices[16] of the constrained simplex. For the discussion, the constraints are simple level constraints but they could be a combination of single level, mixed linear[17] or non-linear constraints[18].
These are the best points to test and to make a mathematical model. They would suffice if the results linearly change from the low to the high level of a component and there are no interactions between the components or higher order effects. This may not be the case and the results might have curvature caused by the interactions or higher order effects. Choosing a quadratic or higher model will require generating other points like a point halfway between A and F or E and D (not shown). These are edge centroids. Models that are more complex may require face centroids, (point G, both the face and the overall constrained simplex centroid) or lattice centroids (like points between the vertices A-F and G, none shown). Point G is usually included in experimentation as a check for interactions or higher order effects. The more complex the model, the more points need testing.
If a good model that can predict the result within experimental error is determined and it can predict the results from a test point, no more work is necessary unless there is more than one result that needs modeling. Figure 3 shows traditionally generated result contours for the model 1.3x + 2.5y - 0.76z + 0.007xyz = result. Max in Figure 2 and Figure 3 represents the best (maximum) result.
Multiple types of results need multiple models similar to that shown in Figure 3. They are combined to get the most desirable outcome[6]. In the case where the optimum location for each of the results does not coincide, the experimenter’s judgment is necessary to weight the different types of results and the tradeoff between the best set of factors to balance the various results. Even the traditional approach can involve qualitative judgments.
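As a rough illustration of how such a model locates a maximum, the sketch below evaluates the quoted equation on a 1% grid. Several things here are assumptions rather than statements from the figures: that x, y and z are the percent levels of components 1, 2 and 3, that the region of interest is the one bounded by the Figure 2 level constraints, and that a grid search is an acceptable substitute for whatever the DOE program does. It is not claimed to reproduce the Max point drawn in Figure 3.

```python
def model(x, y, z):
    """Result predicted by the example model (x, y, z assumed in percent)."""
    return 1.3 * x + 2.5 * y - 0.76 * z + 0.007 * x * y * z

# Level constraints quoted for Figure 2 (assumed to apply to this model).
BOUNDS = {"x": (20, 60), "y": (10, 50), "z": (10, 60)}

def grid_maximum(step=1):
    """Search the constrained mixture region on a regular grid."""
    best_point, best_value = None, float("-inf")
    for x in range(BOUNDS["x"][0], BOUNDS["x"][1] + 1, step):
        for y in range(BOUNDS["y"][0], BOUNDS["y"][1] + 1, step):
            z = 100 - x - y                      # mixture constraint
            if not BOUNDS["z"][0] <= z <= BOUNDS["z"][1]:
                continue
            value = model(x, y, z)
            if value > best_value:
                best_point, best_value = (x, y, z), value
    return best_point, best_value

point, value = grid_maximum()
print("grid maximum at", point, "with predicted result", round(value, 2))
```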
Sequential Experimentation with Minimal Factor Spanning (MFS) Approach
The traditional DOE approach acknowledges the need to vary the factor levels enough to get significantly different results to avoid the variation just being noise. Often the levels chosen try to cover a large design space to minimize the amount of experimentation. Choosing wide levels then requires trials that will be used to model higher order effects. This can lead to modeling difficulties when there are catastrophic changes in the results over the levels selected and can lead to areas where experimental conditions become infeasible and no results are available for model building.
The approach described here is a sequential experimentation where the factor levels vary over a narrow range (MFS). The primary goal is to address the inability of the traditional approach to model qualitative results. Unlike the traditional approach of using separate designs for screening and optimization, both of these occur simultaneously. Additional benefits are continued progress even if there are catastrophic changes in results and the avoidance of infeasible areas. It uses some of the traditional methods to design the experiments. The traditional methods used are a DOE linear selection model for a constrained simplex trial set generation and a selection of a subset of trials based on optimality. The optimality versus the number of trials determines the number of trials in a set. As with any qualitative results, the approach has to rely on the experimenter’s experience and judgment to evaluate the results and pick the best one.
The procedure begins similar to the traditional approach. Select components (factors) of interest and select low and high levels for each component. However, the results between the selected levels should give results that are just far enough apart so that the difference is likely real. Real means they should be reproducible and outside of the normal replication variation. Figure 4 shows the concept of MFS in graphical detail. The low and high levels can be determined by running one factor at a time (OFAT[19]) on each component. If the experimenter has familiarity with the effect of the components in the formulation, they can use levels based on their experience. However, using OFAT to determine the levels will avoid any bias or incorrect beliefs.
The procedure can vary depending on the program[20]. Some programs ask for the number of trials and then report some type of optimality or efficiency. The discussion uses G-efficiency (global, G-Eff %). The preference is for D-optimal designs that maximize the D-efficiency (determinant). D-optimality is a volume criterion on the generalized variance of the parameter estimates[21]. D-optimality follows a similar pattern to the G-optimality used here. Starting with one more trial than design components (number of factors or dimensions), note the optimality. Add more trials one at a time and monitor the optimality gains. The gains become smaller for each additional trial or the trend worsens. From this, select the number of trials to use. Using the blends in Figure 2, which do not represent MFS, the number of trials that had a good G-optimality (90%) was 4. The procedures (trial generation and optimal selection) chose trials A, B, D and F as the best ones to test while eliminating C and E. Removing 2 of the 6 trials reduces the amount of experimentation by 33% [(6-4)/6*100].
Points H to M (Figure 12) are the MFS constrained simplex translated to a different part of the ternary blend space. The MFS example is similar in shape to that in Figure 2 but covers a much smaller area. The optimal selection of candidate points (A to F and H to M) has the same relationship. 16 experiment trials are usually sufficient to evaluate 8 factors. The results of the individual trials in a trial set are determined. The factor levels giving the best result become the starting point to generate a second trial set. Repeat the process and eventually a trial set will show no improved results. Typically, this occurs within 3-5 design sets from a reasonable starting point, with about 6 components and in a system familiar to the experimenter. A poor starting point with components that are unfamiliar to the experimenter and having many components can take about 10 repetitions. The worst-case encountered required 7 trial sets for 8 unfamiliar components.
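The efficiency-versus-trial-count check can be sketched in plain Python for the Figure 2 example. The assumptions are worth stating: the six vertices are computed here from the quoted level constraints (their assignment to the letters A to F is not known), the model is a Scheffé linear mixture model, G-efficiency is taken as 100·p/(n·max d) with d = x'(X'X)⁻¹x over the candidate set, and the subset is found by exhaustive search. DOE programs may use a different definition or an exchange algorithm, so the numbers need not match the 90% quoted above.

```python
from itertools import combinations

# Vertices implied by the Figure 2 constraints, in percent (c1, c2, c3).
VERTICES = [(20, 50, 30), (20, 20, 60), (30, 10, 60),
            (60, 10, 30), (60, 30, 10), (40, 50, 10)]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination; None if singular."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def g_efficiency(design, candidates):
    """G-efficiency (%) of a design for a linear mixture model."""
    rows = [[v / 100 for v in pt] for pt in design]   # model terms = proportions
    p, n = len(rows[0]), len(rows)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    worst = 0.0                                       # largest scaled prediction variance
    for pt in candidates:
        x = [v / 100 for v in pt]
        sol = solve(xtx, x)
        if sol is None:
            return 0.0
        worst = max(worst, sum(a * b for a, b in zip(x, sol)))
    return 100.0 * p / (n * worst)

def best_subset(n_trials):
    """Exhaustively pick the n_trials vertices with the highest G-efficiency."""
    return max(combinations(VERTICES, n_trials),
               key=lambda d: g_efficiency(d, VERTICES))

# Start at one more trial than the number of components and add one at a time.
for n_trials in range(4, len(VERTICES) + 1):
    chosen = best_subset(n_trials)
    print(n_trials, round(g_efficiency(chosen, VERTICES), 1), chosen)
```

Exhaustive search is only practical for small candidate sets like this one; with hundreds of candidate vertices a program would rely on an exchange-type search instead.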
Often qualitative test results for several of the trials become indistinguishable as to which is better. If the results are acceptable, averaging the factors would give more robust factor levels and the process is complete. If results are not acceptable and improvement is needed, it becomes necessary to alter the test conditions to stress the trials for differentiation. Changing test conditions would be devastating in the traditional approach, as all the trials would have to be re-run and a new model generated. In this method, only indistinguishable ones need retesting. As testing the trial sets continues, it is possible that each set will begin to show a larger percentage of trials becoming indistinguishable from each other. This is a further indication the factors are moving to a more robust area.
Balancing multiple responses for the best overall results will likely involve some judgment. A traditional procedure for balancing multiple responses involves setting up desirability functions. While this can be useful if the results are quantitative, qualitative results are less amenable to it. Ratings given to qualitative results are at the judgment of the experimenter and have no objective relationship to each other, so modeling them can be problematic. The desirability functions suffer from the same judgment issue: even if the results are quantitative, how much weight to give one type of response compared to another remains at the judgment of the experimenter. One benefit of this procedure is that if there is no agreement on which trial is best, the alternate best trials become starting points for a completely new series to see if it leads to a better local optimum.
Illustrative Simulation
Optimization
This example uses an initial starting level for each of the 9 components (factors, independent variables) in the blend along with the range for the MFS range (Table 2). Adding and subtracting half the MFS for each component to their respective initial start level gave low and high levels used as constraints in a constrained mixture design. F1V1 refers to formula 1 vertex trial set 1. There are 396 vertices as possible trial candidate points for F1V1. No edge, face or lattice centroids were included.
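A minimal sketch of how the low and high constraints follow from a start level and an MFS range is shown below. The numbers are hypothetical (Table 2 is not reproduced here), and clipping a negative low level to 0 is taken from the handling of negative values described later in this example.

```python
def mfs_constraints(start_levels, mfs_ranges):
    """Return (low, high) per component: start level -/+ half the MFS range."""
    constraints = []
    for start, span in zip(start_levels, mfs_ranges):
        low = max(0.0, start - span / 2)   # a negative level is set to 0
        high = start + span / 2
        constraints.append((low, high))
    return constraints

# Hypothetical start levels (%) and MFS ranges (%) for three components.
print(mfs_constraints([10, 0, 3], [4, 2, 1]))
```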
Table 3 has the 20 trials selected for evaluation. The results can be qualitative, quantitative or a combination of several results. From those 20, the best was number 9. Since this is a simulation of a real world situation, there are no actual qualitative results. Instead, each trial is scored by taking the absolute difference between each component level in the trial blend and the level of that component in a preselected optimum blend, dividing each difference by half the MFS range, and summing over all components. The best trial is the one with the smallest sum. As an example in Table 4, 2.83 is the value between the levels selected as the optimum and the optimum blend found from this process.
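The simulated measure of "best" can be written as a short function. The function name and the example blends are hypothetical; only the formula (absolute differences normalized by half the MFS range, then summed) comes from the description above.

```python
def distance_to_optimum(trial, optimum, mfs_ranges):
    """Sum of |trial - optimum| per component, each divided by half its MFS range."""
    return sum(abs(t - o) / (r / 2) for t, o, r in zip(trial, optimum, mfs_ranges))

# Hypothetical two-component slice with a 1% MFS range on each component.
optimum = (3.0, 3.0)
trials = [(2.5, 3.5), (3.0, 3.0), (4.0, 2.0)]
scores = [distance_to_optimum(t, optimum, (1, 1)) for t in trials]
best_trial = trials[scores.index(min(scores))]   # the smallest sum is "best"
print(scores, best_trial)
```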
Table 3 also includes the beginning of the next series (F1V2, formulation 1 vertex trial set 2). F1V2 did not have the G-efficiencies re-determined but simply used the number of trials initially determined for F1V1 (20) to select the F1V2 trials (not shown). This procedure was continued.
By F1V5, components 3 and 5 were bounding a zero level in the blend. With the range for them being 2, the low level was -1% in the trial blend. Any negative values are set to 0. These components were included but could have been removed in F1V6 and F1V7. Their removal would not alter the progress toward optimization. Both component 3 and 5 in the best blends for both F1V6 and F1V7 were at the 0% level. Component 7 was also optimizing out to a 0% level by F1V7. For F1V8 trial set, setting 3, 5, and 7 to 0% reduced the blend to 6 components. Going from 9 factors to 6 factors reduced the number of candidate trials from 396 to 56. The G-efficiency versus the number of trial blends was then reevaluated (Figure 8). Eight was the number of trials used in subsequent vertex trial sets. The removal of factors significantly reduces the size of the candidate trials and the number of trials in the set. Instead of removing factors from the design, alternate components could have been substituted for screening and optimization.
F1V8 and F1V9 used the reduced number of factors (6) and trials (8). By this point, the levels of the components were steady and simply alternated between the MFS range limits chosen for them (Figure 9 and Figure 10).
Figure 9 Two Components Level Trend for a 9 Component Mixture
Figure 10 Six Components Level Trend for a 9 Component Mixture
In the F1V9 trial set, 2 of the trial blends were equal in performance. An average of these blends would give a more robust formulation. It also indicates the factors are optimizing toward a more robust area. In this illustrative example, the components of the 2 trial blends (F1V9 #149, F1V9 #154) were at the same level except for components 8 (2.5%, 3.5%) and 9 (3.5%, 2.5%) respectively. The average for both would be 3.0%. The level difference in each of components 8 and 9 is 1%. This is the same as the MFS range. Since all the components are within the MFS range, additional optimization progress is not possible. Table 4 lists the final component levels compared to the target optimum.
Convex Simplex Hull Trial Reduction
One of the concerns with the procedure is that some of the trials in a subsequent trial set could be contained within the factor space already tested in previous trial sets. This would represent redundancy. Where running trials are not burdensome (cost, time, complexity, equipment limited), this is not an issue. The redundant points could have some value. If they were not as good as the best point, they would re-validate that the area is of less interest than the direction of optimization. If one is better, it could indicate that the best result is suspect and require re-evaluation.
A simplex (ternary, tetrahedron or hypertetrahedron) is an n-1 dimension convex hull in an n-dimensional space. Finding a convex hull in a simplex space is essentially finding a hull inside a hull. There appears to be little information on the determination of a convex hull in a simplex space in open literature (internet). The information found discusses the convex hull in Cartesian space. The convex simplex hull is abbreviated here as CSH.
Using the full 3 and 9 factors (coordinates) discussed previously along with linear programming and Delaunay triangulation in Python to determine if subsequent points were in the hull of previous points gave calculation warnings. The results were suspect. For this discussion, averaging all the coordinate points of the referenced trials defines the coordinate averaged centroid (CAC). This is not the same as the typically defined geometric or center of mass centroid. The CAC will still be within the convex hull but not necessarily at the geometric centroid. The overall CAC for all the vertices (148) in the F1V1 to F1V8 sets was not in the hull formed by all the F1V1 to F1V8 vertices. This is not possible. This process was unusable. To use the standard methods for determining if a point is in a convex hull, the input has the mixture coordinates (dimensions) reduced by 1. This modification and the process of determining if a point is in the CSH is designated InHullCSH.
The 3 component mixtures (F2) used only 2 coordinates and F1 used only 8 of the 9 coordinates. If the x (1, 0, 0), y (0, 1, 0) and z (0, 0, 1) coordinates are reduced by removing the z dimension, the coordinates become (1, 0), (0, 1) and (0, 0). This would map any point (dashed arrow Figure 11) on the dark gray mixture triangle in Figure 1 to the Cartesian coordinates (x, y) on the darker gray triangle in Figure 11.
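The coordinate reduction can be illustrated for the 3-component case with a small pure-Python sketch. Note the substitution: instead of the linear programming and Delaunay routines mentioned above, it uses a simpler test that only works in 2D (a point is in the hull if it lies in some triangle of hull points). The function names are hypothetical, and the points are the six vertices implied by the Figure 2 constraints, used only as sample data. For nine components a general routine such as a linear program would be needed.

```python
from itertools import combinations

def drop_last(point):
    """Reduce mixture coordinates by one dimension (the last is implied by the sum)."""
    return tuple(point[:-1])

def cross(o, a, b):
    """Signed area term: positive if o -> a -> b turns counterclockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_triangle(p, a, b, c, tol=1e-9):
    """True if p is inside or on the edge of the non-degenerate triangle a, b, c."""
    if abs(cross(a, b, c)) <= tol:          # skip collinear triples
        return False
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (min(d) < -tol and max(d) > tol)

def in_hull_csh(point, previous_points):
    """2D-only check that a 3-component blend lies in the hull of earlier blends."""
    p = drop_last(point)
    reduced = [drop_last(q) for q in previous_points]
    return any(in_triangle(p, a, b, c) for a, b, c in combinations(reduced, 3))

def cac(points):
    """Coordinate averaged centroid: the mean of each coordinate."""
    return tuple(sum(col) / len(points) for col in zip(*points))

# Sample data: vertices implied by the Figure 2 level constraints (percent).
vertices = [(20, 50, 30), (20, 20, 60), (30, 10, 60),
            (60, 10, 30), (60, 30, 10), (40, 50, 10)]
print(in_hull_csh(cac(vertices), vertices))    # the CAC should be inside
print(in_hull_csh((100, 0, 0), vertices))      # pure component 1 should be outside
```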
Three Component Example Blend
Reducing the level ranges to MFS levels, translating the points to the lower left section of the full ternary diagram shown in Figure 2 and reducing the simplex range gives Figure 12 in which the component levels do not span 0 to 100%. Component 1 ranges from 0 to 50%. Component 2 ranges from 0 to 50% and component 3 ranges from 50 to 100%. Figure 12 diagrams 2 trial sets and the optimally selected trials for 3 components.
Figure 12 shows a plot of 6 vertices (H to M) for 3 components generated using low and high MFS level constraints. These points are congruent to those in Figure 2. The G-efficiency optimal design method, using 4 for the number of trials, selected the points H, I, K and M. The green line defines the CSH for this first trial set (F2V1). Point K is by selection the best, and the procedure is repeated from it. The numbers (1 to 6) indicate vertices for the second trial set (F2V2). Optimal selection excluded points 2 and 6. The yellow lines define the CSH for the second set. It shows one trial (Point 1) from the second set (yellow F2V2) is within the CSH of the first trial set (green F2V1). InHullCSH, using both a Python linear programming routine and a Python Delaunay triangulation routine, determined that point 1 was in the CSH of F2V1 and the others (3, 4, 5) were not.
Nine Component Example Blend
For the F1 nine component optimization series, there are 156 total trial points (#1 to #156) contained in the 9 vertex trial sets. The series consisted of 20 vertices in each of the trial sets F1V1 to F1V7 and 8 vertices in each of the trial sets F1V8 and F1V9. Since there are no previous vertices for F1V1 (#1 to #20), the initial 20 vertices cannot be in any prior CSH. Checking the vertices with the InHullCSH process, no F1V2 trial set vertices (#21 to #40) were in the CSH formed by the trial vertices of F1V1. Similarly, no F1V3 trial set vertices (#41 to #60) were in the CSH formed by the trial vertices of F1V1+F1V2 (#1 to #40). Repeating the process until the final trial set F1V9 (#149 to #156) was checked against all the previous trials (#1 to #148) showed only 3 individual vertices out of 136 vertices (#21 to #156) contained in previous CSHs. This indicates only a minor amount of redundant formulation testing could occur.
For a check on the InHullCSH modification, each CAC of a single trial set was in the CSH for all vertex sets up to and including that set. The F1V1 CAC was in the CSH formed by the points in F1V1 set. The F1V2 CAC was in the CSH formed by the points in F1V1 +F1V2 trial sets and so on as expected. As another check, the CAC for each set was not in the previous sets points as would be expected. The F1V2 CAC was not in the CSH of the F1V1 trials. The F1V3 CAC was not in the CSH of the F1V1 + F1V2 trials and so on. For more confidence in the modification, a limited number of cases done using the slower Delaunay triangulation method gave the same results.
While the InHullCSH modification may work practically, it may not be rigorous. Mapping the simplex coordinates to a Cartesian space and using those would be a more rigorous way to determine whether a point is in a convex hull. The Cartesian space will have 1 less dimension, but this is in essence the same as the modification used, as shown in Figure 11. The remapping consists of setting a single vertex in the initial vertices set (V1) as the origin for the Cartesian coordinates. Then transform the simplex vertex coordinates to Cartesian coordinates for generating a convex hull and checking whether new vertices are in the previous convex hull. Selecting Component 3 as the origin, the simplex and Cartesian coordinates for the 3 components are in Table 6.
REFERENCES - External Links and Citations
Google Docs converts these from endnotes to footnotes.
Rev. 20200325-01