
quantileTDigestWeighted

Introduced in: v20.1

Computes an approximate quantile of a numeric data sequence using the t-digest algorithm. The function takes into account the weight of each sequence member.

The maximum error is 1%. Memory consumption is proportional to log(n), where n is the number of values.

The performance of the function is lower than that of quantile or quantileTiming. In terms of the ratio of state size to precision, this function is much better than quantile.

The result depends on the order in which the query is executed, so it is nondeterministic.

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.
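As a rough illustration of the difference, the first query below keeps three separate t-digest states, while the second computes all three levels from a single state. This sketch assumes that the plural counterpart quantilesTDigestWeighted is available in your ClickHouse version; the levels and the numbers(100) input are arbitrary choices for the example.

```sql
-- Three separate aggregate states, one per level (less efficient).
SELECT
    quantileTDigestWeighted(0.25)(number, 1) AS q25,
    quantileTDigestWeighted(0.5)(number, 1)  AS q50,
    quantileTDigestWeighted(0.75)(number, 1) AS q75
FROM numbers(100);

-- One shared state; the result is an array with one element per level.
-- Assumption: quantilesTDigestWeighted exists in the server version in use.
SELECT quantilesTDigestWeighted(0.25, 0.5, 0.75)(number, 1) AS q
FROM numbers(100);
```

Because the algorithm is approximate, the two forms are not guaranteed to return identical values.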

Note

Using quantileTDigestWeighted is not recommended for tiny data sets and can lead to significant error. In this case, consider using quantileTDigest instead.

Syntax

quantileTDigestWeighted(level)(expr, weight)

Aliases: medianTDigestWeighted

Parameters

  • level — Optional. Level of the quantile. Constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99]. Default value: 0.5. At level=0.5 the function calculates the median. Float*

Arguments

  • expr — Expression over the column values resulting in numeric data types, Date or DateTime. (U)Int* or Float* or Decimal* or Date or DateTime
  • weight — Column with weights of sequence elements. The weight is the number of occurrences of the value. UInt*

Returned value

Approximate quantile of the specified level. Float64 or Date or DateTime

Examples

Computing weighted quantile with t-digest

SELECT quantileTDigestWeighted(number, 1) FROM numbers(10);
┌─quantileTDigestWeighted(number, 1)─┐
│                                4.5 │
└────────────────────────────────────┘
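Using an explicit level and a per-row weight

The following sketch is meant to illustrate the "number of occurrences" meaning of weight. The first query passes a weight for each value; the second writes out the same multiset row by row and uses the unweighted quantileTDigest. The level 0.9, the weight expression number % 4 + 1, and the aliases v, w, and r are assumptions made only for this example.

```sql
-- Weighted form: each value v counts as w occurrences (w is 1..4 here).
SELECT quantileTDigestWeighted(0.9)(v, w) AS weighted
FROM
(
    SELECT number AS v, number % 4 + 1 AS w
    FROM numbers(1000)
);

-- Expanded form: repeat each row w times with arrayJoin, then aggregate unweighted.
SELECT quantileTDigest(0.9)(v) AS expanded
FROM
(
    SELECT number AS v, arrayJoin(range(number % 4 + 1)) AS r
    FROM numbers(1000)
);
```

The two results should be close to each other, but because t-digest is approximate and nondeterministic they are not expected to match exactly.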

See Also