## 9.4 Linear Regression Aggregate Functions

Linear regression functions are useful for trend line continuation. The trend or regression line is usually a pattern followed by a set of values. Linear regression is useful to predict future values. To continue the regression line, you need to know the slope and the point of intersection with the y-axis. As set of linear functions can be used for calculating these values.

In the function syntax, y is interpreted as an x-dependent variable.

The linear regression aggregate functions take a pair of arguments, the dependent variable expression (y) and the independent variable expression (x), which are both numeric value expressions. Any row in which either argument evaluates to `NULL` is removed from the rows that qualify. If there are no rows that qualify, then the result of `REGR_COUNT` is `0` (zero), and the other linear regression aggregate functions result in `NULL`.

### 9.4.1 `REGR_AVGX()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_AVGX ( <y>, <x> )`
```

The function `REGR_AVGX` calculates the average of the independent variable (x) of the regression line.

The function `REGR_AVGX(<y>, <x>)` is equivalent to

````SUM(<exprX>) / REGR_COUNT(<y>, <x>)`
` `
`<exprX> :==`
`  CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <x> END`
```

### 9.4.2 `REGR_AVGY()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_AVGY ( <y>, <x> )`
```

The function `REGR_AVGY` calculates the average of the dependent variable (y) of the regression line.

The function `REGR_AVGY(<y>, <x>)` is equivalent to

````SUM(<exprY>) / REGR_COUNT(<y>, <x>)`
` `
`<exprY> :==`
`  CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <y> END`
```

### 9.4.3 `REGR_COUNT()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_COUNT ( <y>, <x> )`
```

The function `REGR_COUNT` counts the number of non-empty pairs of the regression line.

The function `REGR_COUNT(<y>, <x>)` is equivalent to

````COUNT(*) FILTER (WHERE <x> IS NOT NULL AND <y> IS NOT NULL)`
```

See alsoSection 9.2.2, `COUNT()`

### 9.4.4 `REGR_INTERCEPT()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_INTERCEPT ( <y>, <x> )`
```

The function `REGR_INTERCEPT` calculates the point of intersection of the regression line with the y-axis.

The function `REGR_INTERCEPT(<y>, <x>)` is equivalent to

````REGR_AVGY(<y>, <x>) - REGR_SLOPE(<y>, <x>) * REGR_AVGX(<y>, <x>)`
```

#### 9.4.4.1 `REGR_INTERCEPT` Examples

Forecasting sales volume

````with recursive years (byyear) as (`
`  select 1991`
`  from rdb\$database`
`  union all`
`  select byyear + 1`
`  from years`
`  where byyear < 2020`
`),`
`s as (`
`  select`
`    extract(year from order_date) as byyear,`
`    sum(total_value) as total_value`
`  from sales`
`  group by 1`
`),`
`regr as (`
`  select`
`    regr_intercept(total_value, byyear) as intercept,`
`    regr_slope(total_value, byyear) as slope`
`  from s`
`)`
`select`
`  years.byyear as byyear,`
`  intercept + (slope * years.byyear) as total_value`
`from years`
`cross join regr`
```
````BYYEAR TOTAL_VALUE`
`------ ------------`
`  1991    118377.35`
`  1992    414557.62`
`  1993    710737.89`
`  1994   1006918.16`
`  1995   1303098.43`
`  1996   1599278.69`
`  1997   1895458.96`
`  1998   2191639.23`
`  1999   2487819.50`
`  2000   2783999.77`
`...`
```

### 9.4.5 `REGR_R2()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_R2 ( <y>, <x> )`
```

The REGR_R2 function calculates the coefficient of determination, or R-squared, of the regression line.

The function `REGR_R2(<y>, <x>)` is equivalent to

````POWER(CORR(<y>, <x>), 2)`
```

### 9.4.6 `REGR_SLOPE()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_SLOPE ( <y>, <x> )`
```

The function `REGR_SLOPE` calculates the slope of the regression line.

The function `REGR_SLOPE(<y>, <x>)` is equivalent to

````COVAR_POP(<y>, <x>) / VAR_POP(<exprX>)`
` `
`<exprX> :==`
`  CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <x> END`
```

### 9.4.7 `REGR_SXX()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_SXX ( <y>, <x> )`
```

The function `REGR_SXX` calculates the sum of squares of the independent expression variable (x).

The function `REGR_SXX(<y>, <x>)` is equivalent to

````REGR_COUNT(<y>, <x>) * VAR_POP(<exprX>)`
` `
`<exprX> :==`
`  CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <x> END`
```

### 9.4.8 `REGR_SXY()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_SXY ( <y>, <x> )`
```

The function `REGR_SXY` calculates the sum of products of independent variable expression (x) times dependent variable expression (y).

The function `REGR_SXY(<y>, <x>)` is equivalent to

````REGR_COUNT(<y>, <x>) * COVAR_POP(<y>, <x>)`
```

### 9.4.9 `REGR_SYY()`

Available inDSQL, PSQL

Result type`DOUBLE PRECISION`

Syntax

````REGR_SYY ( <y>, <x> )`
```

The function `REGR_SYY` calculates the sum of squares of the dependent variable (y).

The function `REGR_SYY(<y>, <x>)` is equivalent to

````REGR_COUNT(<y>, <x>) * VAR_POP(<exprY>)`
` `
`<exprY> :==`
`  CASE WHEN <x> IS NOT NULL AND <y> IS NOT NULL THEN <y> END`
```