The SDL Component Suite is an industry-leading collection of components supporting scientific and engineering computing. Please visit the SDL Web site for more information.



Percentile


Unit: SDL_matrix
Class: TMat4D
Declaration: [1] function Percentile (prob: double; LowCol, HighCol, LowRow, HighRow, LowLayer, HighLayer, LowTimeSlot, HighTimeSlot: integer): double;
[2] function Percentile (prob: double; LowCol, HighCol, LowRow, HighRow, LowLayer, HighLayer, LowTimeSlot, HighTimeSlot: integer; SampleSize: integer): double;
[3] function Percentile (prob: double; LowCol, HighCol, LowRow, HighRow, LowLayer, HighLayer, LowTimeSlot, HighTimeSlot: integer; SampleSize: integer; AbundantVal: double): double;

The function Percentile returns the prob-th percentile of the selected matrix elements, where prob is given in percent. The calculation is based on QuickSelect and includes all matrix elements within the inclusive ranges LowCol to HighCol, LowRow to HighRow, LowLayer to HighLayer, and LowTimeSlot to HighTimeSlot. The function returns zero if any error occurs.
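To make the selection step concrete, the following is a minimal Python sketch of a QuickSelect-based percentile over a flat list of values. It is not the SDL implementation: the names quickselect and nearest_percentile, the random pivot choice, and the rank rule (round prob/100 * (n-1) to the nearest index) are assumptions made for illustration. Returning zero for invalid input mirrors the error behavior described above.

```python
import random

def quickselect(values, k):
    """Return the k-th smallest element (0-based) without sorting everything."""
    items = list(values)  # work on a copy so the caller's data is untouched
    while True:
        pivot = random.choice(items)
        smaller = [v for v in items if v < pivot]
        equal_count = sum(1 for v in items if v == pivot)
        if k < len(smaller):
            items = smaller                      # target lies left of the pivot
        elif k < len(smaller) + equal_count:
            return pivot                         # target is the pivot value
        else:
            k -= len(smaller) + equal_count      # target lies right of the pivot
            items = [v for v in items if v > pivot]

def nearest_percentile(values, prob):
    """Percentile without interpolation; returns 0.0 on invalid input."""
    n = len(values)
    if n == 0 or not (0 <= prob <= 100):
        return 0.0
    # Assumed rank rule: nearest index to prob/100 * (n - 1), halves rounded up.
    k = int(prob / 100.0 * (n - 1) + 0.5)
    return quickselect(values, k)
```

Each pass discards the part of the data that cannot contain the requested rank, which is why a selection of this kind typically needs less work than a full sort.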

Version [1] of the function performs an exact calculation, whereas version [2] calculates the percentile of a random sample taken from the specified data range. The size of the random sample is controlled by the parameter SampleSize. If SampleSize is zero, version [1] is executed automatically. If SampleSize is larger than the number of elements in the specified data range, oversampling occurs; this does not harm the result but is slower than version [1]. The main advantage of version [2] is that it is much faster than version [1] if the specified data range is large. Typically, version [2] should be used with a sample size of 10000 if the specified data range contains more than 50000 cells.
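The trade-off between the exact and the sampled calculation can be sketched in Python as follows. This is an illustration under assumptions, not SDL code: the function names are hypothetical, the sample is drawn with replacement, and a plain sort stands in for QuickSelect to keep the example short.

```python
import random

def exact_percentile(values, prob):
    """Exact percentile without interpolation (sort used for brevity)."""
    ordered = sorted(values)
    k = int(prob / 100.0 * (len(ordered) - 1) + 0.5)
    return ordered[k]

def sampled_percentile(values, prob, sample_size, rng=None):
    """Estimate the percentile from a random sample of the data."""
    rng = rng or random.Random()
    if sample_size <= 0:
        # A sample size of zero falls back to the exact calculation.
        return exact_percentile(values, prob)
    # Assumption: sampling with replacement, so a sample larger than the
    # data simply draws some elements more than once (oversampling).
    sample = [rng.choice(values) for _ in range(sample_size)]
    return exact_percentile(sample, prob)

# Usage: 100000 values, estimated from a sample of 10000.
data = list(range(100000))
estimate = sampled_percentile(data, 90, 10000, random.Random(1))
```

The estimate is close to, but generally not identical with, the exact result; the sampled variant only has to process the sample rather than the full data range.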

If the data matrix contains a high proportion (more than 50%) of equal values, versions [1] and [2] become prohibitively slow at high prob values. In this case version [3] can be used to speed up the calculation by specifying the abundant value in the parameter AbundantVal. Please note that the speed of Percentile depends on the distribution of the values, the number of data values, and the parameter prob. It is therefore recommended to test whether version [1], [2], or [3] performs best in a particular situation.
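The source does not describe how version [3] uses AbundantVal internally. The Python sketch below shows one plausible way a known abundant value could shorten the calculation: the equal values are counted instead of being partitioned repeatedly, and only the remaining values are searched. The function name and the rank rule are assumptions for illustration only.

```python
def percentile_with_abundant(values, prob, abundant_val):
    """Percentile without interpolation, treating one frequent value specially."""
    n = len(values)
    if n == 0:
        return 0.0
    # Assumed rank rule: nearest index to prob/100 * (n - 1).
    k = int(prob / 100.0 * (n - 1) + 0.5)
    below = [v for v in values if v < abundant_val]
    above = [v for v in values if v > abundant_val]
    n_equal = n - len(below) - len(above)   # how often the abundant value occurs
    if k < len(below):
        return sorted(below)[k]             # rank falls below the abundant block
    if k < len(below) + n_equal:
        return abundant_val                 # rank falls inside the abundant block
    return sorted(above)[k - len(below) - n_equal]

# Usage: 70% of the cells are zero (e.g. background values).
data = [0.0] * 70 + [float(i) for i in range(1, 31)]
median = percentile_with_abundant(data, 50, 0.0)
p90 = percentile_with_abundant(data, 90, 0.0)
```

In this sketch only the values that differ from the abundant value are sorted, so the cost of the search shrinks as the share of equal values grows.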

Please note that the 50%-percentile (prob = 50) is also known as the median of the distribution.

Hint 1: Setting both the low and the high parameter of a dimension (e.g. LowCol and HighCol) to zero forces the method to use all elements of that dimension.
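The zero convention of Hint 1 can be sketched in Python as follows. The helper names, the 1-based index convention, and the nested-list layout data[slot][layer][row][col] are assumptions chosen for the illustration and say nothing about how TMat4D stores its elements.

```python
def resolve_range(low, high, size):
    """Return the 1-based inclusive range; (0, 0) selects the whole dimension."""
    if low == 0 and high == 0:
        return 1, size
    return low, high

def select_elements(data, col_rng, row_rng, layer_rng, slot_rng):
    """Collect all elements of the 4D block into a flat list."""
    n_slots, n_layers = len(data), len(data[0])
    n_rows, n_cols = len(data[0][0]), len(data[0][0][0])
    c_lo, c_hi = resolve_range(*col_rng, n_cols)
    r_lo, r_hi = resolve_range(*row_rng, n_rows)
    l_lo, l_hi = resolve_range(*layer_rng, n_layers)
    t_lo, t_hi = resolve_range(*slot_rng, n_slots)
    # Convert the 1-based inclusive bounds to 0-based list indices.
    return [data[t - 1][l - 1][r - 1][c - 1]
            for t in range(t_lo, t_hi + 1)
            for l in range(l_lo, l_hi + 1)
            for r in range(r_lo, r_hi + 1)
            for c in range(c_lo, c_hi + 1)]

# Usage: 2 time slots, 1 layer, 2 rows, 3 columns.
data = [[[[1, 2, 3], [4, 5, 6]]], [[[7, 8, 9], [10, 11, 12]]]]
everything = select_elements(data, (0, 0), (0, 0), (0, 0), (0, 0))
part = select_elements(data, (2, 3), (0, 0), (0, 0), (1, 1))
```

Here everything contains all twelve values, while part contains columns 2 to 3 of all rows and layers in the first time slot.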

Hint 2: The method does not interpolate; it simply returns the closest value of the distribution. If the number of included values is low, the result may therefore differ from the results of other statistical packages and from those obtained by TMatrix.Percentile and TVector.Percentile.
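The effect described in Hint 2 can be reproduced with a small Python sketch that compares a non-interpolated percentile with a linearly interpolated one. Both functions are illustrative assumptions: the exact rank rule used by the SDL methods is not stated here, and linear interpolation is only one of several conventions found in statistical packages.

```python
def nearest_percentile(values, prob):
    """Return an actual data value: the one nearest to the requested rank."""
    ordered = sorted(values)
    pos = prob / 100.0 * (len(ordered) - 1)
    return ordered[int(pos + 0.5)]

def interpolated_percentile(values, prob):
    """Linearly interpolate between the two neighbouring data values."""
    ordered = sorted(values)
    pos = prob / 100.0 * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Usage: with only four values the two medians differ noticeably.
few = [10, 20, 30, 40]
print(nearest_percentile(few, 50))        # 30
print(interpolated_percentile(few, 50))   # 25.0
```

With many closely spaced values the gap between neighbouring data points becomes small, so the two approaches give nearly the same result.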



Last Update: 2023-Jul-30