Class Quantile
For values of length n:
- The result is NaN if n = 0.
- The result is values[0] if n = 1.
- Otherwise the result is computed using the Quantile.EstimationMethod.
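The three cases above can be sketched for data that is already sorted. This is a minimal illustration, not the library's code: it assumes the Hyndman and Fan method 8 estimate (the method this page names as the recommended default), and the names `Hf8Sketch` and `quantile` are hypothetical.

```java
final class Hf8Sketch {
    private Hf8Sketch() {}

    /** Estimate the p-th quantile of ascending sorted values (hypothetical helper). */
    static double quantile(double[] sorted, double p) {
        // The negated form also rejects NaN probabilities.
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("Invalid probability: " + p);
        }
        final int n = sorted.length;
        if (n == 0) {
            return Double.NaN;   // no data
        }
        if (n == 1) {
            return sorted[0];    // single value
        }
        // Assumed method 8 position (1-based): h = (n + 1/3) * p + 1/3
        final double h = (n + 1.0 / 3) * p + 1.0 / 3;
        if (h <= 1) {
            return sorted[0];
        }
        if (h >= n) {
            return sorted[n - 1];
        }
        // Linear interpolation between the two neighbouring order statistics.
        final int j = (int) Math.floor(h);
        final double g = h - j;
        return sorted[j - 1] + g * (sorted[j] - sorted[j - 1]);
    }
}
```

A discontinuous estimation method would return one of the two neighbouring values instead of interpolating between them.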
Computation of multiple quantiles will handle duplicate and unordered
probabilities. Passing ordered probabilities is recommended if the order is already
known as this can improve efficiency; for example using uniform spacing through the
array data, or to identify extreme values from the data such as [0.001, 0.999].
This implementation respects the ordering imposed by
Double.compare(double, double) for NaN values. If a NaN occurs
in the selected positions in the fully sorted values then the result is NaN.
The NaNPolicy can be used to change the behaviour on NaN values.
Instances of this class are immutable and thread-safe.
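The NaN ordering described above can be checked with the JDK alone. The sketch below uses a full sort for clarity, whereas the class itself is documented as only partially sorting; `NaNOrderingSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;

final class NaNOrderingSketch {
    private NaNOrderingSketch() {}

    /** Return a sorted copy; Arrays.sort places NaN after all other values. */
    static double[] sortedCopy(double... values) {
        final double[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    /** True if the value at the given position of the fully sorted data is NaN. */
    static boolean selectsNaN(double[] values, int index) {
        return Double.isNaN(sortedCopy(values)[index]);
    }
}
```

With this ordering, only quantiles that read from the upper end of the sorted data can encounter a NaN.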
Support for long arrays
The result on long values can be returned as a double or
a long using a StatisticResult.
The double result is computed within 1 ULP of the exact result. In some
cases this may be outside the range defined by the minimum and maximum of the input
array following rounding to a 53-bit floating point representation. For example a
quantile of an array containing only Long.MAX_VALUE as a double is
2^63, which is the closest representation of 2^63 - 1.
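The rounding in this example can be reproduced with a plain cast. The sketch below measures the exact conversion error with `BigDecimal`; the class and method names are hypothetical.

```java
import java.math.BigDecimal;

final class LongAsDoubleSketch {
    private LongAsDoubleSketch() {}

    /** Exact value of (double) v minus v, computed without further rounding. */
    static BigDecimal conversionError(long v) {
        return new BigDecimal((double) v).subtract(BigDecimal.valueOf(v));
    }
}
```

For `Long.MAX_VALUE` the error is exactly 1, so the `double` lies just above the largest `long`. Values that fit in 53 bits convert with zero error.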
The long result is returned using the nearest whole number.
In the event of ties the result is rounded towards positive infinity.
This value will always be within the range defined by the minimum and maximum
of the input array. Due to interpolation it may be a value not observed in
the input values.
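As an illustration of the tie rule, consider the simplest interpolation: the midpoint of two `long` values. The sketch below is an assumption-laden example rather than the library's algorithm. It computes the mean rounded to the nearest whole number with ties towards positive infinity, using bit operations that cannot overflow. The name `LongMidpointSketch` is hypothetical.

```java
final class LongMidpointSketch {
    private LongMidpointSketch() {}

    /** Mean of a and b rounded to the nearest long, ties towards positive infinity. */
    static long midpoint(long a, long b) {
        // The mean of two integers is a whole number or a half, so rounding
        // ties upwards equals ceil((a + b) / 2). The expression below
        // evaluates that ceiling without forming the sum a + b.
        return (a | b) - ((a ^ b) >> 1);
    }
}
```

The result always lies between the two inputs, yet it may be a value that neither input holds, for example 3 for the inputs 2 and 4.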
Interpolation between two long values requires extended precision
floating-point arithmetic. This can be avoided using a discontinuous Quantile.EstimationMethod.
In this case the long quantile will be a value observed in the input values.
If the array length n is zero the result as a double is
NaN and the result as a long will raise an ArithmeticException.
When multiple quantile results are required as only one of the primitive types, they can be converted to a primitive array using a stream, for example:
long[] values = ...
double[] p = Quantile.probabilities(10);
Quantile q = Quantile.withDefaults();
long[] result = Arrays.stream(q.evaluate(values, p))
.mapToLong(StatisticResult::getAsLong)
.toArray();
- Since:
- 1.1
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static enum | Quantile.EstimationMethod | Enumerates estimation methods for a quantile.
-
Field Summary
Fields
Modifier and Type | Field | Description
private final boolean | copy | Flag to indicate if the data should be copied.
private static final Quantile | DEFAULT | Default instance.
private final Quantile.EstimationMethod | estimationType | Estimation type used to determine the value from the quantile.
private static final String | INVALID_NUMBER_OF_PROBABILITIES | Message when the number of probabilities in a range is not valid.
private static final String | INVALID_PROBABILITY | Message when the probability is not in the range [0, 1].
private static final String | INVALID_SIZE | Message when the size is not valid.
private final NaNPolicy | nanPolicy | NaN policy for floating point data.
private final NaNTransformer | nanTransformer | Transformer for NaN data.
private static final String | NO_PROBABILITIES_SPECIFIED | Message when no probabilities are provided for the varargs method.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
private | Quantile(boolean copy, NaNPolicy nanPolicy, Quantile.EstimationMethod estimationType) |
-
Method Summary
Modifier and Type | Method | Description
private static void | checkNumberOfProbabilities(int n) | Check the number of probabilities n is strictly positive.
private static void | checkProbabilities(double... p) | Check the probabilities p are in the range [0, 1].
private static void | checkProbability(double p) | Check the probability p is in the range [0, 1].
private static void | checkSize(int n) | Check the size is positive.
private double | compute(double[] values, int from, int to, double p) | Compute the p-th quantile of the specified range of values.
private double[] | compute(double[] values, int from, int to, double... p) | Compute the p-th quantiles of the specified range of values.
private double | compute(int[] values, int from, int to, double p) | Compute the p-th quantile of the specified range of values.
private double[] | compute(int[] values, int from, int to, double... p) | Evaluate the p-th quantiles of the specified range of values.
private StatisticResult | compute(long[] values, int from, int to, double p) | Compute the p-th quantile of the specified range of values.
private StatisticResult[] | compute(long[] values, int from, int to, double... p) | Evaluate the p-th quantiles of the specified range of values.
private int[] | computeIndices(int n, double[] p, double[] q, int offset) | Compute the indices required for quantile interpolation.
double | evaluate(double[] values, double p) | Evaluate the p-th quantile of the values.
double[] | evaluate(double[] values, double... p) | Evaluate the p-th quantiles of the values.
double | evaluate(int[] values, double p) | Evaluate the p-th quantile of the values.
double[] | evaluate(int[] values, double... p) | Evaluate the p-th quantiles of the values.
double | evaluate(int n, IntToDoubleFunction values, double p) | Evaluate the p-th quantile of the sorted values provided as a double.
double[] | evaluate(int n, IntToDoubleFunction values, double... p) | Evaluate the p-th quantiles of the sorted values provided as a double.
StatisticResult | evaluate(long[] values, double p) | Evaluate the p-th quantile of the values.
StatisticResult[] | evaluate(long[] values, double... p) | Evaluate the p-th quantiles of the values.
StatisticResult | evaluateAsLong(int n, IntToLongFunction values, double p) | Evaluate the p-th quantile of the sorted values provided as a long.
StatisticResult[] | evaluateAsLong(int n, IntToLongFunction values, double... p) | Evaluate the p-th quantiles of the sorted values provided as a long.
double | evaluateRange(double[] values, int from, int to, double p) | Evaluate the p-th quantile of the specified range of values.
double[] | evaluateRange(double[] values, int from, int to, double... p) | Evaluate the p-th quantiles of the specified range of values.
double | evaluateRange(int[] values, int from, int to, double p) | Evaluate the p-th quantile of the specified range of values.
double[] | evaluateRange(int[] values, int from, int to, double... p) | Evaluate the p-th quantiles of the specified range of values.
StatisticResult | evaluateRange(long[] values, int from, int to, double p) | Evaluate the p-th quantile of the specified range of values.
StatisticResult[] | evaluateRange(long[] values, int from, int to, double... p) | Evaluate the p-th quantiles of the specified range of values.
static double[] | probabilities(int n) | Generate n evenly spaced probabilities in the range [0, 1].
static double[] | probabilities(int n, double p1, double p2) | Generate n evenly spaced probabilities in the range [p1, p2].
Quantile | with(NaNPolicy v) | Return an instance with the configured NaNPolicy.
Quantile | with(Quantile.EstimationMethod v) | Return an instance with the configured Quantile.EstimationMethod.
Quantile | withCopy(boolean v) | Return an instance with the configured copy behaviour.
static Quantile | withDefaults() | Return a new instance with the default options.
-
Field Details
-
INVALID_PROBABILITY
Message when the probability is not in the range [0, 1].
-
NO_PROBABILITIES_SPECIFIED
Message when no probabilities are provided for the varargs method.
-
INVALID_SIZE
Message when the size is not valid.
-
INVALID_NUMBER_OF_PROBABILITIES
Message when the number of probabilities in a range is not valid.
-
DEFAULT
Default instance. Method 8 is recommended by Hyndman and Fan. -
copy
private final boolean copy
Flag to indicate if the data should be copied. -
nanPolicy
NaN policy for floating point data. -
nanTransformer
Transformer for NaN data. -
estimationType
Estimation type used to determine the value from the quantile.
-
-
Constructor Details
-
Quantile
- Parameters:
copy - Flag to indicate if the data should be copied.
nanPolicy - NaN policy.
estimationType - Estimation type used to determine the value from the quantile.
-
-
Method Details
-
withDefaults
Return a new instance with the default options.
Note: The default options configure for processing in-place and including NaN values in the data. This is the most efficient mode and has the smallest memory consumption.
- Returns:
- the quantile implementation
-
withCopy
Return an instance with the configured copy behaviour. If false then the input array will be modified by the call to evaluate the quantiles; otherwise the computation uses a copy of the data.
- Parameters:
v - Value.
- Returns:
- an instance
-
with
Return an instance with the configured NaNPolicy.
Note: This implementation respects the ordering imposed by Double.compare(double, double) for NaN values: NaN is considered greater than all other values, and all NaN values are equal. The NaNPolicy changes the computation of the statistic in the presence of NaN values.
- NaNPolicy.INCLUDE: NaN values are moved to the end of the data; the size of the data includes the NaN values and the quantile will be NaN if any value used for quantile interpolation is NaN.
- NaNPolicy.EXCLUDE: NaN values are moved to the end of the data; the size of the data excludes the NaN values and the quantile will never be NaN for non-zero size. If all data are NaN then the size is zero and the result is NaN.
- NaNPolicy.ERROR: An exception is raised if the data contains NaN values.
Note that the result is identical for all policies if no NaN values are present.
- Parameters:
v - Value.
- Returns:
- an instance
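The effect of the three policies on the data size can be sketched as follows. This is an assumed illustration, not the library's NaNTransformer: the `Policy` enum, the class name, and `effectiveSize` are all hypothetical stand-ins.

```java
final class NaNPolicySketch {
    private NaNPolicySketch() {}

    /** Hypothetical stand-in for the three documented policies. */
    enum Policy { INCLUDE, EXCLUDE, ERROR }

    /**
     * Move NaN values to the end of the data (in place) and return the
     * size to use for the quantile computation under the given policy.
     */
    static int effectiveSize(double[] data, Policy policy) {
        int end = data.length;
        for (int i = data.length - 1; i >= 0; i--) {
            if (Double.isNaN(data[i])) {
                if (policy == Policy.ERROR) {
                    // Raised before any element has been moved.
                    throw new IllegalArgumentException("NaN at index " + i);
                }
                // Swap the NaN with the last element not yet known to be NaN.
                end--;
                data[i] = data[end];
                data[end] = Double.NaN;
            }
        }
        // INCLUDE counts the trailing NaN values; EXCLUDE ignores them.
        return policy == Policy.INCLUDE ? data.length : end;
    }
}
```

When no NaN is present the loop changes nothing and all three policies return the full length, which matches the statement that the result is identical for all policies in that case.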
-
with
Return an instance with the configured Quantile.EstimationMethod.
- Parameters:
v - Value.
- Returns:
- an instance
-
probabilities
public static double[] probabilities(int n)
Generate n evenly spaced probabilities in the range [0, 1].
1/(n + 1), 2/(n + 1), ..., n/(n + 1)
- Parameters:
n - Number of probabilities.
- Returns:
- the probabilities
- Throws:
IllegalArgumentException - if n < 1
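The sequence above can be reproduced in a few lines. The following is a standalone sketch of the documented formula under a hypothetical class name, not the library source.

```java
final class EvenProbabilitiesSketch {
    private EvenProbabilitiesSketch() {}

    /** Generate i / (n + 1) for i = 1, ..., n. */
    static double[] probabilities(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("Invalid number of probabilities: " + n);
        }
        final double[] p = new double[n];
        for (int i = 1; i <= n; i++) {
            p[i - 1] = (double) i / (n + 1);
        }
        return p;
    }
}
```

For n = 3 this yields the quartile probabilities 0.25, 0.5 and 0.75. The end points 0 and 1 are never generated.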
-
probabilities
public static double[] probabilities(int n, double p1, double p2)
Generate n evenly spaced probabilities in the range [p1, p2].
w = p2 - p1
p1 + w/(n + 1), p1 + 2w/(n + 1), ..., p1 + nw/(n + 1)
- Parameters:
n - Number of probabilities.
p1 - Lower probability.
p2 - Upper probability.
- Returns:
- the probabilities
- Throws:
IllegalArgumentException - if n < 1; if the probabilities are not in the range [0, 1]; or p2 <= p1.
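The ranged variant follows the same pattern with the interval width w. The sketch below applies the documented formula and argument checks under a hypothetical class name. The exact order of evaluation in the library may differ, so results could vary in the last bits for inputs that are not exactly representable.

```java
final class RangeProbabilitiesSketch {
    private RangeProbabilitiesSketch() {}

    /** Generate p1 + i * w / (n + 1) for i = 1, ..., n with w = p2 - p1. */
    static double[] probabilities(int n, double p1, double p2) {
        if (n < 1) {
            throw new IllegalArgumentException("Invalid number of probabilities: " + n);
        }
        // The negated form also rejects NaN arguments.
        if (!(p1 >= 0 && p2 <= 1 && p1 < p2)) {
            throw new IllegalArgumentException("Invalid range: [" + p1 + ", " + p2 + "]");
        }
        final double w = p2 - p1;
        final double[] p = new double[n];
        for (int i = 1; i <= n; i++) {
            p[i - 1] = p1 + i * w / (n + 1);
        }
        return p;
    }
}
```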
-
evaluate
public double evaluate(double[] values, double p)
Evaluate the p-th quantile of the values.
Note: This method may partially sort the input values if not configured to copy the input data.
Performance
It is not recommended to use this method for repeat calls for different quantiles within the same values. The evaluate(double[], double...) method should be used which provides better performance.
- Parameters:
values - Values.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]; or if the values contain NaN and the configuration is NaNPolicy.ERROR
-
evaluateRange
public double evaluateRange(double[] values, int from, int to, double p)
Evaluate the p-th quantile of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
Performance
It is not recommended to use this method for repeat calls for different quantiles within the same values. The evaluateRange(double[], int, int, double...) method should be used which provides better performance.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]; or if the values contain NaN and the configuration is NaNPolicy.ERROR
IndexOutOfBoundsException - if the sub-range is out of bounds
- Since:
- 1.2
-
compute
private double compute(double[] values, int from, int to, double p)
Compute the p-th quantile of the specified range of values.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]
-
evaluate
public double[] evaluate(double[] values, double... p)
Evaluate the p-th quantiles of the values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; no probabilities are specified; or if the values contain NaN and the configuration is NaNPolicy.ERROR
-
evaluateRange
public double[] evaluateRange(double[] values, int from, int to, double... p)
Evaluate the p-th quantiles of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; no probabilities are specified; or if the values contain NaN and the configuration is NaNPolicy.ERROR
IndexOutOfBoundsException - if the sub-range is out of bounds
- Since:
- 1.2
-
compute
private double[] compute(double[] values, int from, int to, double... p)
Compute the p-th quantiles of the specified range of values.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
-
evaluate
public double evaluate(int[] values, double p)
Evaluate the p-th quantile of the values.
Note: This method may partially sort the input values if not configured to copy the input data.
Performance
It is not recommended to use this method for repeat calls for different quantiles within the same values. The evaluate(int[], double...) method should be used which provides better performance.
- Parameters:
values - Values.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]
-
evaluateRange
public double evaluateRange(int[] values, int from, int to, double p)
Evaluate the p-th quantile of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
Performance
It is not recommended to use this method for repeat calls for different quantiles within the same values. The evaluateRange(int[], int, int, double...) method should be used which provides better performance.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]
IndexOutOfBoundsException - if the sub-range is out of bounds
- Since:
- 1.2
-
compute
private double compute(int[] values, int from, int to, double p)
Compute the p-th quantile of the specified range of values.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]
-
evaluate
public double[] evaluate(int[] values, double... p)
Evaluate the p-th quantiles of the values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
-
evaluateRange
public double[] evaluateRange(int[] values, int from, int to, double... p)
Evaluate the p-th quantiles of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
IndexOutOfBoundsException - if the sub-range is out of bounds
- Since:
- 1.2
-
compute
private double[] compute(int[] values, int from, int to, double... p)
Evaluate the p-th quantiles of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
-
evaluate
Evaluate the p-th quantile of the values.
Note: This method may partially sort the input values if not configured to copy the input data.
Performance
It is not recommended to use this method for repeat calls for different quantiles within the same values. The evaluate(long[], double...) method should be used which provides better performance.
- Parameters:
values - Values.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]
- Since:
- 1.3
-
evaluateRange
Evaluate the p-th quantile of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
Performance
It is not recommended to use this method for repeat calls for different quantiles within the same values. The evaluateRange(long[], int, int, double...) method should be used which provides better performance.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]
IndexOutOfBoundsException - if the sub-range is out of bounds
- Since:
- 1.3
-
compute
Compute the p-th quantile of the specified range of values.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if the probability p is not in the range [0, 1]
- Since:
- 1.3
-
evaluate
Evaluate the p-th quantiles of the values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
- Since:
- 1.3
-
evaluateRange
Evaluate the p-th quantiles of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
IndexOutOfBoundsException - if the sub-range is out of bounds
- Since:
- 1.3
-
compute
Evaluate the p-th quantiles of the specified range of values.
Note: This method may partially sort the input values if not configured to copy the input data.
- Parameters:
values - Values.
from - Inclusive start of the range.
to - Exclusive end of the range.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
-
evaluate
Evaluate the p-th quantile of the sorted values provided as a double.
This method can be used when the values of known size are already sorted. It can be used for primitive types not supported by other evaluation methods. Numeric types byte, short and float can be converted to type double without loss of precision.
short[] x = ...
Arrays.sort(x);
double q = Quantile.withDefaults().evaluate(x.length, i -> x[i], 0.05);
If the sorted array is a long datatype this method can lose information about the precision of the quantiles due to primitive type conversion. Use the method evaluateAsLong(int, IntToLongFunction, double) to compute the long quantile result.
- Parameters:
n - Size of the values.
values - Values function.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if size < 0; or if the probability p is not in the range [0, 1].
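The difference between lossless and lossy widening can be checked directly. The sketch below reads values through an `IntToDoubleFunction` accessor as this method does, and tests whether a `long` survives conversion to `double` exactly. The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.function.IntToDoubleFunction;

final class AccessorSketch {
    private AccessorSketch() {}

    /** Read n values through the accessor, as a function-based method would. */
    static double[] toDoubles(int n, IntToDoubleFunction values) {
        final double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = values.applyAsDouble(i);
        }
        return out;
    }

    /** True if the long value has an exact double representation. */
    static boolean isExactAsDouble(long v) {
        // Compare exactly; a cast back to long would saturate and hide errors.
        return new BigDecimal((double) v).compareTo(BigDecimal.valueOf(v)) == 0;
    }
}
```

Every `short` is exact as a `double`, whereas 2^53 + 1 is the smallest positive `long` that is not.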
-
evaluate
Evaluate the p-th quantiles of the sorted values provided as a double.
This method can be used when the values of known size are already sorted. It can be used for primitive types not supported by other evaluation methods. Numeric types byte, short and float can be converted to type double without loss of precision.
short[] x = ...
Arrays.sort(x);
double[] q = Quantile.withDefaults().evaluate(x.length, i -> x[i], 0.25, 0.5, 0.75);
If the sorted array is a long datatype this method can lose information about the precision of the quantiles due to primitive type conversion. Use the method evaluateAsLong(int, IntToLongFunction, double...) to compute the long quantile result.
- Parameters:
n - Size of the values.
values - Values function.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if size < 0; if any probability p is not in the range [0, 1]; or no probabilities are specified.
-
evaluateAsLong
Evaluate the p-th quantile of the sorted values provided as a long.
This method can be used when the values of known size are already sorted.
long[] x = ...
Arrays.sort(x);
StatisticResult q = Quantile.withDefaults()
    .evaluateAsLong(x.length, i -> x[i], 0.05);
Note: It is not recommended to sort data for use only in the quantile computation. The evaluate(long[], double) method will partially sort the data as required and in most cases will be more efficient.
- Parameters:
n - Size of the values.
values - Values function.
p - Probability for the quantile to compute.
- Returns:
- the quantile
- Throws:
IllegalArgumentException - if size < 0; or if the probability p is not in the range [0, 1].
- Since:
- 1.3
-
evaluateAsLong
Evaluate the p-th quantiles of the sorted values provided as a long.
This method can be used when the values of known size are already sorted.
long[] x = ...
Arrays.sort(x);
StatisticResult[] q = Quantile.withDefaults()
    .evaluateAsLong(x.length, i -> x[i], 0.25, 0.5, 0.75);
Note: It is not recommended to sort data for use only in the quantile computation. The evaluate(long[], double...) method will partially sort the data as required and in most cases will be more efficient.
- Parameters:
n - Size of the values.
values - Values function.
p - Probabilities for the quantiles to compute.
- Returns:
- the quantiles
- Throws:
IllegalArgumentException - if size < 0; if any probability p is not in the range [0, 1]; or no probabilities are specified.
- Since:
- 1.3
-
checkProbability
private static void checkProbability(double p)
Check the probability p is in the range [0, 1].
- Parameters:
p - Probability for the quantile to compute.
- Throws:
IllegalArgumentException - if the probability is not in the range [0, 1]
-
checkProbabilities
private static void checkProbabilities(double... p)
Check the probabilities p are in the range [0, 1].
- Parameters:
p - Probabilities for the quantiles to compute.
- Throws:
IllegalArgumentException - if any probability p is not in the range [0, 1]; or no probabilities are specified.
-
checkSize
private static void checkSize(int n)
Check the size is positive.
- Parameters:
n - Size of the values.
- Throws:
IllegalArgumentException - if size < 0
-
checkNumberOfProbabilities
private static void checkNumberOfProbabilities(int n)
Check the number of probabilities n is strictly positive.
- Parameters:
n - Number of probabilities.
- Throws:
IllegalArgumentException - if n < 1
-
computeIndices
private int[] computeIndices(int n, double[] p, double[] q, int offset)
Compute the indices required for quantile interpolation.
The zero-based interpolation index in [0, n) is saved into the working array q for each p.
The indices are incremented by the provided offset to allow addressing sub-ranges of a larger array.
- Parameters:
n - Size of the data.
p - Probabilities for the quantiles to compute.
q - Working array for quantiles in [0, n).
offset - Array offset.
- Returns:
- the indices in [offset, offset + n)
-