PERCENTILE_CONT & PERCENTILE_DISC Analytic Functions in Oracle SQL
PERCENTILE_CONT is an inverse distribution function that assumes a continuous distribution model.
A continuous distribution describes the probabilities of the possible values of a continuous random variable.
A continuous random variable is a random variable with a set of possible values (known as the range) that is infinite and uncountable.
An inverse distribution is the distribution of the reciprocal of a random variable.
PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model.
A discrete distribution describes the probability of occurrence of each value of a discrete random variable.
A discrete random variable is a random variable that has countable values, such as a list of non-negative integers.
Example as aggregate function
computes the median salary in each department:
Note: At 0.5 value PERCENTILE_CONT works as MEDIAN function.
Even Number: PERCENTILE_CONT returns the average of the two middle values whereas PERCENTILE_DISC returns the value of the first one among the two middle values.
OLD Number: both functions return the value of the middle element.
SELECT department_id,sum(salary),
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary DESC) "Median cont",
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC) "Median disc"
FROM employees
GROUP BY department_id
ORDER BY department_id;
DEPARTMENT_ID SUM(SALARY) Median cont Median disc ------------- ----------- ----------- ----------- 10 4400 4400 4400 20 19000 9500 13000 30 21800 2800 2800 40 6500 6500 6500 50 156400 3100 3100 60 28800 4800 4800 70 10000 10000 10000 80 304500 8900 9000 90 58000 17000 17000 100 51608 8000 8200 110 20308 10154 12008
---Manually calculate median:
SELECT department_id,salary,
MEDIAN(salary) OVER (PARTITION BY department_id) "Medianvalue"
FROM employees
ORDER BY department_id;
DEPARTMENT_ID SALARY Medianvalue ------------- ---------- ----------- 10 4400 4400 20 6000 9500 20 13000 9500 30 2500 2800 30 2600 2800 30 2800 2800 30 2900 2800 30 11000 2800
Example to use of PERCENTILE_CONT Analytic function
Exampl show the median for Department 60 is 4800, which has a corresponding percentile (Percent_Rank) of 0.5.
None of the salaries in Department 30 have a percentile of 0.5,
so the median value must be interpolated between 2900 (percentile 0.4) and 2800 (percentile 0.6), which evaluates to 2850.
SELECT last_name, salary, department_id,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary DESC)
OVER (PARTITION BY department_id) "Percentile_Cont",
PERCENT_RANK()
OVER (PARTITION BY department_id ORDER BY salary DESC) "Percent_Rank"
FROM employees
WHERE department_id IN (30, 60)
ORDER BY last_name, salary, department_id;
LAST_NAME SALARY DEPARTMENT_ID Percentile_Cont Percent_Rank ---------- ---------- ------------- --------------- ------------ Austin 4800 60 4800 .5 Baida 2900 30 2800 .4 Colmenares 2500 30 2800 1 Ernst 6000 60 4800 .25 Himuro 2600 30 2800 .8 Hunold 9000 60 4800 0 Khoo 30 2800 0 Lorentz 4200 60 4800 1 Pataballa 4800 60 4800 .5 Raphaely 11000 30 2800 .2 Tobias 2800 30 2800 .6
Example to use of PERCENTILE_DISC Analytic function
Example calculates the median discrete percentile of the salary of each employee in the sample table hr.employees.
SELECT last_name, salary, department_id,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC)
OVER (PARTITION BY department_id) "Percentile_Disc",
CUME_DIST() OVER (PARTITION BY department_id
ORDER BY salary DESC) "Cume_Dist"
FROM employees
WHERE department_id in (30, 60)
ORDER BY last_name, salary, department_id;
LAST_NAME SALARY DEPARTMENT_ID Percentile_Disc Cume_Dist ---------- ---------- ------------- --------------- ---------- Austin 4800 60 4800 .8 Baida 2900 30 2800 .5 Colmenares 2500 30 2800 1 Ernst 6000 60 4800 .4 Himuro 2600 30 2800 .833333333 Hunold 9000 60 4800 .2 Khoo 30 2800 .166666667 Lorentz 4200 60 4800 1 Pataballa 4800 60 4800 .8 Raphaely 11000 30 2800 .333333333 Tobias 2800 30 2800 .666666667