topK
Introduced in: v1.1
Returns an array of the approximately most frequent values in the specified column. The resulting array is sorted in descending order of approximate frequency of values (not by the values themselves).
Implements the Filtered Space-Saving algorithm for analyzing TopK, based on the reduce-and-combine algorithm from Parallel Space Saving.
This function does not provide a guaranteed result. In certain situations, errors might occur and it might return frequent values that aren't the most frequent values.
See Also
Syntax
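The call forms implied by the parameter list can be sketched as follows. This assumes the usual ClickHouse convention for parametric aggregate functions: parameters go in the first pair of parentheses and arguments in the second. The third form writes the counts flag as the literal string 'counts'. That spelling is an assumption to verify against your server version.

```sql
-- Sketch of the call forms; N, load_factor and column are placeholders.
topK(N)(column)
topK(N, load_factor)(column)
topK(N, load_factor, 'counts')(column)
```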
Parameters
N - The number of elements to return. Default value: 10. Maximum value of N = 65536. UInt64
load_factor - Optional. Defines how many cells are reserved for values. If uniq(column) > N * load_factor, the result of the topK function will be approximate. Default value: 3. UInt64
counts - Optional. Defines whether the result should contain an approximate count and error value. Bool
Arguments
column - The name of the column for which to find the most frequent values. String
Returned value
Returns an array of the approximately most frequent values, sorted in descending order of approximate frequency. Array
Examples
Usage example
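A minimal sketch of a query, using inline data rather than a real table so that it is self-contained. The column alias x and the sample values are hypothetical and chosen only for illustration.

```sql
-- Inline data so the query runs without any table:
-- the value 1 appears three times, 2 twice, and 3 once.
SELECT topK(2)(x) AS res
FROM
(
    SELECT arrayJoin([1, 1, 1, 2, 2, 3]) AS x
);
```

Here uniq(x) = 3 does not exceed N * load_factor = 2 * 3 = 6, so the result should not be affected by approximation, and res should hold the two most frequent values, 1 and 2, in that order.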