Frac Analysis in VISAGE: Using Distributions as an Alternative to Linear Regressions

November 11, 2014 by

Editor’s Note: While VISAGE rebranded to VERDAZO in April 2016, we haven’t changed the VISAGE name in our previous blog posts. We’re proud of our decade of work as VISAGE and that lives on within these blogs. Enjoy.

We are fortunate in Canada to have rich data sets such as Canadian Discovery’s Well Completions and Frac Database (WCFD) and the IHS Information Hub. When we integrate the frac data with public production data, the analysis opportunities are immense. However, anyone who has tried to apply a linear regression looking for a correlation between production performance and a particular completion parameter knows that the correlations are typically weak (see the Linear Regression example below).


Why are the correlations weak? Some of the reasons are:

  1. There are too many factors influencing a well’s production performance (e.g. technology, spacing, tonnage, number of stages, base fluids … not to mention the reservoir).
  2. The relationships may not be linear.
  3. There can be “thresholds of effectiveness”. Below a lower threshold, changing the parameter has limited impact. Above an upper threshold, it also has limited impact. The range of values between the lower and upper thresholds is the “correlation window”. This is where the strongest relationships exist between the two variables.

Cumulative probability distributions (or percentile distributions) can be a powerful tool to identify the “correlation window” and further develop insights into optimal ranges for key completion parameters.

Using the same dataset as the Linear Regression Example above, we can present this data as a percentile distribution. Using VISAGE, I created the following chart to show:

  1. Distributions of Cumulative Oil / Stage for each well.
  2. The well data sets forming each distribution are established using bins (where Proppant Placed Per Stage falls within defined ranges; see the legend).
  3. Bin sizes were selected in an attempt to have no fewer than 20 wells in each group. This is not always possible. The more wells you have, the more reliable the distribution.
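
For readers who want to reproduce the binning outside VISAGE, the steps above can be sketched in a few lines of Python. This is a minimal illustration, not how VISAGE builds the chart: the bin edges, the function names, the midpoint plotting position and the sample wells are all assumptions made for the example. The sketch also places a well at exactly 30 tonnes in the “30 to 45” bin, which is a convention choice rather than something the chart confirms.

```python
from collections import defaultdict

# Hypothetical bin edges (tonnes of proppant placed per stage). The chart
# legend uses ranges such as "15 to 30" and "30 to 45"; the rest are assumed.
BIN_EDGES = [0, 15, 30, 45, 60]
MIN_WELLS = 20  # target minimum well count per bin, as described in the text


def bin_label(proppant_per_stage, edges=BIN_EDGES):
    """Return a label such as '15 to 30', or None if the value is outside all bins."""
    for low, high in zip(edges, edges[1:]):
        if low <= proppant_per_stage < high:
            return f"{low} to {high}"
    return None


def percentile_distributions(wells, edges=BIN_EDGES):
    """Group wells by bin and return sorted (value, cumulative probability) points.

    `wells` is an iterable of (proppant_per_stage, cum_oil_per_stage) pairs.
    """
    groups = defaultdict(list)
    for proppant, cum_oil in wells:
        label = bin_label(proppant, edges)
        if label is not None:
            groups[label].append(cum_oil)

    curves = {}
    for label, values in groups.items():
        values.sort()
        n = len(values)
        # Midpoint plotting position: the i-th smallest value sits at (i + 0.5) / n.
        curves[label] = [(v, (i + 0.5) / n) for i, v in enumerate(values)]
    return curves


# Made-up wells for illustration only; these are not real production data.
example_wells = [
    (10, 120.0), (12, 90.0),
    (20, 150.0), (25, 110.0),
    (35, 400.0), (40, 310.0),
]
curves = percentile_distributions(example_wells)

# Flag bins that fall short of the 20-well target so they can be read with caution.
sparse_bins = [label for label, points in curves.items() if len(points) < MIN_WELLS]
```

Each entry in `curves` can be plotted as one line on a chart with a logarithmic value axis. Note that this sketch accumulates probability from the smallest value upward; petroleum charts often use the opposite (exceedance) convention, so flip the probabilities if that is what your audience expects.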



Observations:

  1. The distributions inherently align like-performing wells and incorporate the statistical variability of other factors. Because well performance is the product of many uncertain factors, and multiplying uncertainties together tends to produce a lognormal distribution, the data are well suited to this presentation. The relative placement of the distributions communicates their relative performance.
  2. There is a dramatic step change as the Proppant Placed per Stage exceeds 30 tonnes. The “correlation window” occurs between the “15 to 30” data set and the “30 to 45” data set.
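
The step change described above is read visually from the chart, but a rough numerical check is possible. The sketch below is one simple way to do it, under stated assumptions: it compares the median (P50) of each bin with the next and reports the adjacent pair with the largest ratio. The function name `largest_step`, the choice of the median, the use of a ratio (rather than a difference, since the data are roughly lognormal) and the sample values are all illustrative.

```python
import statistics


def largest_step(bin_values):
    """Find the adjacent pair of bins with the largest jump in median.

    `bin_values` is a list of (label, values) ordered by increasing proppant
    range. Returns (lower_label, upper_label, median_ratio).
    """
    medians = [(label, statistics.median(values)) for label, values in bin_values]
    best = None
    for (low_label, low_med), (high_label, high_med) in zip(medians, medians[1:]):
        ratio = high_med / low_med
        if best is None or ratio > best[2]:
            best = (low_label, high_label, ratio)
    return best


# Made-up cumulative oil per stage values for illustration only.
example_bins = [
    ("0 to 15", [80, 100, 120]),
    ("15 to 30", [100, 110, 130]),
    ("30 to 45", [200, 220, 260]),
    ("45 to 60", [210, 230, 250]),
]
step = largest_step(example_bins)
# With these made-up values, `step` is ("15 to 30", "30 to 45", 2.0).
```

A single ratio of medians ignores the shape of each distribution and the number of wells behind it, so treat it as a prompt for where to look on the chart rather than as a conclusion.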

Next steps:

In the next blog, we will look at the next steps in “using distributions to refine your insight”. This includes:

  1. How to focus your analysis within the “correlation window”
  2. How to use other metrics to refine your conclusions about the optimal ranges for key completion parameters.


Production Data: IHS Information Hub

Frac Data: Well Completions and Frac Database from Canadian Discovery

Visual Analysis: VISAGE

Thanks for reading. I welcome your questions and suggestions for future blogs.

About VISAGE – visual analytics for the petroleum industry
VISAGE analytics software equips operators and analysts in the petroleum industry to make the most valuable and timely decisions possible. VISAGE brings together public and proprietary oil and gas data from multiple sources for easy-to-use interactive analysis.