Multivariate Analysis: Completion Optimization’s Silver Bullet?

October 17, 2016

The term Multivariate Analysis has gained in popularity – and hype – in the oil and gas industry, particularly as it pertains to completion analysis. The goal of Multivariate Analysis is typically to understand the relationship of multiple input variables to one or more outcomes, attempting to isolate the effect each individual input variable has on a particular outcome. Given the many geological and completion parameters that influence production profiles, it is not surprising that the industry is embracing Multivariate Analysis in its search for optimal completion designs. The use of this term has evolved to include a broad range of tools and techniques, such as:

  1. Visual tools, such as parallel coordinates, that communicate the relationships between inputs and outcomes
  2. Workflows that leverage mathematical, statistical and visual techniques to identify the most pertinent inputs and determine optimal design considerations (this is where VERDAZO excels)
  3. Regression analysis (often perceived as a “black box”) that aims to isolate the most pertinent inputs that influence optimal completion designs and arrive at a predictive equation

[Image: black box with graphs. Image source: Wikipedia]

Multivariate analysis encompasses a broad range of tools, techniques, technologies and workflows… but not without notable dangers. I sat down with Tyler Schlosser, Director of Commodities Research, GLJ Petroleum Consultants, who has considerable experience and expertise in this area, for a conversation about Multivariate Analysis. Here’s what came out of our conversation…

The Most Common Pitfalls

a)        Analogue selection: both quantity and quality matter

Quality: If the analogue wells are not similar to the region you plan to drill and complete, then your conclusions will be less relevant and predictive.

Quantity: It’s important that your dataset has a sufficient number of analogue wells relative to the number of input variables being considered. A multivariate analysis examining the impact of 20 variables on a set of just 30 wells will likely yield some misleading results.
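A quick way to see why quantity matters is to correlate pure noise. The Python sketch below is illustrative only: the wells, inputs and outcome are random numbers, and the function names are hypothetical. It measures the strongest correlation that appears purely by chance when 20 meaningless inputs are compared with a meaningless outcome, first for 30 wells and then for 300.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def strongest_chance_correlation(n_wells, n_inputs, rng):
    """Largest |r| between random inputs and a random outcome."""
    outcome = [rng.gauss(0, 1) for _ in range(n_wells)]
    best = 0.0
    for _ in range(n_inputs):
        noise_input = [rng.gauss(0, 1) for _ in range(n_wells)]
        best = max(best, abs(pearson(noise_input, outcome)))
    return best


def average_over_trials(n_wells, n_inputs, trials=50, seed=1):
    """Average the strongest chance correlation over repeated random trials."""
    rng = random.Random(seed)
    results = [strongest_chance_correlation(n_wells, n_inputs, rng)
               for _ in range(trials)]
    return sum(results) / trials


small = average_over_trials(n_wells=30, n_inputs=20)
large = average_over_trials(n_wells=300, n_inputs=20)
print(f"30 wells:  strongest chance |r| averages {small:.2f}")
print(f"300 wells: strongest chance |r| averages {large:.2f}")
```

With few wells and many inputs, at least one input will usually appear to matter even though no relationship exists. The same effect can hide inside a real dataset.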

b)        Assuming independence of inputs

The assumption that all inputs are independent of one another is usually false. Accounting for dependencies between inputs is crucial in drawing accurate conclusions about which input variables have the greatest effect on the outcome. For example, the number of stages, completed length, proppant density, and stage spacing are all related to each other, so they do not each contribute entirely unique information.
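One simple first check for dependent inputs is a pairwise correlation screen. The sketch below uses invented well data and an arbitrary cutoff of 0.8, both assumptions made for illustration. Pairwise correlation will not catch every dependency, because some involve three or more inputs at once.

```python
import math
from itertools import combinations

# Hypothetical wells; every value is invented for illustration.
wells = {
    "completed_length_m": [1000, 1500, 2000, 2500, 3000],
    "stages": [10, 16, 20, 24, 30],
    "proppant_density": [1.0, 2.0, 1.5, 1.0, 2.0],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def dependent_pairs(inputs, cutoff=0.8):
    """List input pairs whose |r| exceeds the (assumed) cutoff."""
    flagged = []
    for name_a, name_b in combinations(inputs, 2):
        r = pearson(inputs[name_a], inputs[name_b])
        if abs(r) > cutoff:
            flagged.append((name_a, name_b, r))
    return flagged


flagged = dependent_pairs(wells)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} move together (r = {r:.2f})")
```

In this made-up dataset, completed length and number of stages are flagged as carrying largely the same information, so treating both as separate drivers of the outcome would overstate what the data can tell you.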

c)        Assuming linearity in correlations

Many, if not most, relationships in the oil and gas world are nonlinear. Forcing nonlinear relationships into a linear framework will likely yield misleading results.
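To illustrate, the sketch below compares a linear measure (Pearson correlation) with a rank-based measure (Spearman correlation) on a synthetic diminishing-returns curve. The curve and its parameters are assumptions chosen for illustration, not field data. The outcome rises with every increase in the input, yet the linear measure understates the relationship.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: measures how well a straight line fits."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Synthetic input and a saturating (diminishing-returns) response.
input_values = list(range(1, 11))
response = [1 - math.exp(-x / 2) for x in input_values]

linear_r = pearson(input_values, response)
rank_r = spearman(input_values, response)
print(f"Pearson  r = {linear_r:.2f}")
print(f"Spearman r = {rank_r:.2f}")
```

A gap between the two values is a hint that a straight line is the wrong shape for the relationship, and that a transformation or a nonlinear model deserves a look.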

d)        Thresholds

When correlations exist, they are not always continuous. The range of values where the correlation is strongest may be limited by thresholds: values above or below which the correlation is weaker or does not exist. For example, an increase in proppant density may have no significant impact until it exceeds a specific threshold.
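A brute-force way to look for such a threshold is to try every candidate split, fit a flat "no effect" model below it and a straight line above it, and keep the split with the smallest total error. The sketch below does this on invented numbers. The data, the function names and the choice of model are all assumptions made for illustration, and real data would need a more careful treatment of noise.

```python
def sse_flat(ys):
    """Sum of squared errors around the mean (a 'no effect' model)."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def sse_line(xs, ys):
    """Sum of squared errors around a least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def find_threshold(xs, ys, min_points=3):
    """Return the first x value of the best-fitting 'responsive' segment."""
    pairs = sorted(zip(xs, ys))
    best_error, best_x = None, None
    # Keep at least min_points wells on each side of the split.
    for split in range(min_points, len(pairs) - min_points + 1):
        below = pairs[:split]
        above = pairs[split:]
        error = (sse_flat([y for _, y in below])
                 + sse_line([x for x, _ in above], [y for _, y in above]))
        if best_error is None or error < best_error:
            best_error, best_x = error, above[0][0]
    return best_x


# Hypothetical wells: flat response at low proppant density, rising above it.
proppant_density = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75]
production = [100, 101, 99, 100, 100, 115, 125, 135, 145, 155]

threshold = find_threshold(proppant_density, production)
print(f"Response appears to begin at a proppant density of {threshold}")
```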

e)        Not applying dimensional normalization when you should

For many analysis goals, it can be important to use dimensional normalization (e.g. production/100 m completed length) to properly isolate the effect of a specific input. For completion analysis, the most common variables used to dimensionally normalize an output include completed length, tonnes of proppant placed, completion costs, and number of stages.
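As a small worked example, the sketch below normalizes production to 100 m of completed length for two hypothetical wells. The well names and values are invented, and production can be in any consistent unit.

```python
def per_100m(production, completed_length_m):
    """Production per 100 m of completed length."""
    return production / completed_length_m * 100.0


# Hypothetical wells; the numbers are invented for illustration.
wells = [
    {"name": "Well A", "production": 12000.0, "completed_length_m": 2400.0},
    {"name": "Well B", "production": 9000.0, "completed_length_m": 1500.0},
]

for well in wells:
    well["production_per_100m"] = per_100m(
        well["production"], well["completed_length_m"]
    )

best_raw = max(wells, key=lambda w: w["production"])
best_normalized = max(wells, key=lambda w: w["production_per_100m"])

print(f"Highest total production:     {best_raw['name']}")
print(f"Highest production per 100 m: {best_normalized['name']}")
```

The longer well produces more in total, but the shorter well produces more per 100 m. Which ranking matters depends on the question being asked.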

f)        Data availability and quality

It can be difficult in Multivariate Analysis to identify when issues exist in the data. Consider this scenario: after a certain date, completion costs in a particular area of the Montney appeared to drop significantly. In reality, the costs for each operator remained the same, but one high-cost operator stopped reporting their cost values. Recognizing when nuances like this exist within the data is required to avoid being misled toward false conclusions.
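The scenario above can be reproduced with a few invented records. In the sketch below, the operators, periods and costs are all hypothetical. The area-wide average drops even though no operator's costs changed, and a per-operator breakdown exposes the reason.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cost reports: (operator, period, completion cost).
reports = [
    ("Operator A", "before", 8.0),
    ("Operator A", "before", 8.0),
    ("Operator B", "before", 4.0),
    ("Operator B", "before", 4.0),
    # After the date in question, Operator A stops reporting costs.
    ("Operator B", "after", 4.0),
    ("Operator B", "after", 4.0),
]


def mean_cost_by(records, key):
    """Average cost for each group produced by key(operator, period)."""
    groups = defaultdict(list)
    for operator, period, cost in records:
        groups[key(operator, period)].append(cost)
    return {group: mean(costs) for group, costs in groups.items()}


area_average = mean_cost_by(reports, lambda operator, period: period)
per_operator = mean_cost_by(
    reports, lambda operator, period: (operator, period)
)

print("Area average by period:", area_average)
for (operator, period), cost in sorted(per_operator.items()):
    print(f"{operator}, {period}: {cost}")
```

Checking which operators report in each period, alongside the averages, is a cheap safeguard against this kind of false trend.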

Important Considerations When Selecting Your Tool/Approach

1)      Transparency

When selecting your tool or approach, remember that the transparency of your conclusions will be important to decision makers. The ability to explain results adds credibility when crafting a narrative. An answer from a “black box” regression analysis, no matter how accurate, lacks transparency and may not be easily explained. Listen to this Spark podcast, past the 5-minute mark, for a good example of the black box nature of some algorithms.

2)     Accessibility

Regression models typically require a strong technical and mathematical aptitude to deliver meaningful results. However, those results, and the calculations used to arrive at them, may not be easily understood by the target audience. Visual analysis tools and techniques can make the results more accessible and intuitive, and are better able to communicate nuances in the data.


Multivariate Analysis offers powerful capabilities in discovering, identifying and explaining relationships between several variables. As with any analytics tool, it requires critical thought to set up correctly and careful interpretation of the results in order to draw the right conclusions and effectively communicate them. Avoiding common pitfalls and crafting an effective narrative from the insights learned in Multivariate Analysis should help the diligent data analyst receive maximum buy-in from decision-makers. One tool or technique may not be adequate on its own… consider complementary techniques. Given the cost and complexity of completions, an investment in technology, and in the time to use it properly, is becoming increasingly important.


Thanks for reading. We welcome your questions and suggestions for future blogs.


Verdazo Analytics helps companies make smarter, faster decisions. VERDAZO software reveals the hidden insights in complex data through visual analysis tools, pre-built templates, custom reports and our dynamic discovery analytics workflows. Business users will be doing effective analysis on nearly any data source within minutes of getting set up. Verdazo Analytics: the leader in discovery analytics software for the Oil & Gas industry.