PERCENTILE_CONT & PERCENTILE_DISC Analytic Functions in Oracle SQL

PERCENTILE_CONT & PERCENTILE_DISC Analytic Functions in Oracle SQL

PERCENTILE_CONT is an inverse distribution function that assumes a continuous distribution model.
A continuous distribution describes the probabilities of the possible values of a continuous random variable.
A continuous random variable is a random variable with a set of possible values (known as the range) that is infinite and uncountable.
An inverse distribution is the distribution of the reciprocal of a random variable.

PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model.
A discrete distribution describes the probability of occurrence of each value of a discrete random variable.
A discrete random variable is a random variable that has countable values, such as a list of non-negative integers.

Example as aggregate function
computes the median salary in each department:

Note: At 0.5 value PERCENTILE_CONT works as MEDIAN function.
Even Number: PERCENTILE_CONT returns the average of the two middle values whereas PERCENTILE_DISC returns the value of the first one among the two middle values.
OLD Number: both functions return the value of the middle element.

SELECT department_id,sum(salary),
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary DESC) "Median cont",
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC) "Median disc"
FROM employees
GROUP BY department_id
ORDER BY department_id;

DEPARTMENT_ID SUM(SALARY) Median cont Median disc
------------- ----------- ----------- -----------
           10        4400        4400        4400
           20       19000        9500       13000
           30       21800        2800        2800
           40        6500        6500        6500
           50      156400        3100        3100
           60       28800        4800        4800
           70       10000       10000       10000
           80      304500        8900        9000
           90       58000       17000       17000
          100       51608        8000        8200
          110       20308       10154       12008

---Manually calculate median:
SELECT department_id,salary,
MEDIAN(salary) OVER (PARTITION BY department_id) "Medianvalue"
FROM employees
ORDER BY department_id;

DEPARTMENT_ID     SALARY Medianvalue
------------- ---------- -----------
           10       4400        4400
           20       6000        9500
           20      13000        9500
           30       2500        2800
           30       2600        2800
           30       2800        2800
           30       2900        2800
           30      11000        2800

Example to use of PERCENTILE_CONT Analytic function
Exampl show the median for Department 60 is 4800, which has a corresponding percentile (Percent_Rank) of 0.5.
None of the salaries in Department 30 have a percentile of 0.5,
so the median value must be interpolated between 2900 (percentile 0.4) and 2800 (percentile 0.6), which evaluates to 2850.

SELECT last_name, salary, department_id,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary DESC)
OVER (PARTITION BY department_id) "Percentile_Cont",
PERCENT_RANK()
OVER (PARTITION BY department_id ORDER BY salary DESC) "Percent_Rank"
FROM employees
WHERE department_id IN (30, 60)
ORDER BY last_name, salary, department_id;

LAST_NAME      SALARY DEPARTMENT_ID Percentile_Cont Percent_Rank
---------- ---------- ------------- --------------- ------------
Austin           4800            60            4800           .5
Baida            2900            30            2800           .4
Colmenares       2500            30            2800            1
Ernst            6000            60            4800          .25
Himuro           2600            30            2800           .8
Hunold           9000            60            4800            0
Khoo                             30            2800            0
Lorentz          4200            60            4800            1
Pataballa        4800            60            4800           .5
Raphaely        11000            30            2800           .2
Tobias           2800            30            2800           .6

Example to use of PERCENTILE_DISC Analytic function
Example calculates the median discrete percentile of the salary of each employee in the sample table hr.employees.

SELECT last_name, salary, department_id,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC)
OVER (PARTITION BY department_id) "Percentile_Disc",
CUME_DIST() OVER (PARTITION BY department_id
ORDER BY salary DESC) "Cume_Dist"
FROM employees
WHERE department_id in (30, 60)
ORDER BY last_name, salary, department_id;

LAST_NAME      SALARY DEPARTMENT_ID Percentile_Disc  Cume_Dist
---------- ---------- ------------- --------------- ----------
Austin           4800            60            4800         .8
Baida            2900            30            2800         .5
Colmenares       2500            30            2800          1
Ernst            6000            60            4800         .4
Himuro           2600            30            2800 .833333333
Hunold           9000            60            4800         .2
Khoo                             30            2800 .166666667
Lorentz          4200            60            4800          1
Pataballa        4800            60            4800         .8
Raphaely        11000            30            2800 .333333333
Tobias           2800            30            2800 .666666667

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.