Spark DataFrame: count distinct values of every column

In PySpark you could do something like this, using countDistinct():

    from pyspark.sql.functions import col, countDistinct

    df.agg(*(countDistinct(col(c)).alias(c) for c in df.columns))

Similarly in Scala:

    import org.apache.spark.sql.functions.countDistinct
    import org.apache.spark.sql.functions.col

    df.select(df.columns.map(c => countDistinct(col(c)).alias(c)): _*)

If you want to speed things up at the potential loss of accuracy, you could also use approxCountDistinct().
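
For completeness, a minimal sketch of the approximate variant, written against Spark's Java API (the same functions object the Scala snippet imports); approx_count_distinct superseded the deprecated approxCountDistinct in Spark 2.1, and df here is an assumed, already-loaded Dataset<Row>:

    import static org.apache.spark.sql.functions.approx_count_distinct;
    import static org.apache.spark.sql.functions.col;

    import java.util.Arrays;
    import org.apache.spark.sql.Column;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    // Approximate distinct count per column; 0.05 is the maximum
    // relative standard deviation allowed for the estimate.
    Column[] aggs = Arrays.stream(df.columns())
            .map(c -> approx_count_distinct(col(c), 0.05).alias(c))
            .toArray(Column[]::new);
    Dataset<Row> counts = df.agg(aggs[0], Arrays.copyOfRange(aggs, 1, aggs.length));
    counts.show();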

PySpark: count distinct over a window

EDIT: as noleto mentions in his answer below, PySpark 2.1 added approx_count_distinct, which works over a window.

Original answer – exact distinct count (not an approximation)

We can use a combination of size and collect_set to mimic the functionality of countDistinct over a window:

    from pyspark.sql import functions as F, Window

… Read more
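
Since the excerpt is cut off, here is an independent sketch of the same size/collect_set idea, this time through the Java API; df and the column names group and value are assumptions for illustration:

    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.collect_set;
    import static org.apache.spark.sql.functions.size;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.expressions.Window;
    import org.apache.spark.sql.expressions.WindowSpec;

    // collect_set gathers the distinct values of "value" inside each
    // window partition; size then counts them, an exact distinct count.
    WindowSpec byGroup = Window.partitionBy(col("group"));
    Dataset<Row> withCounts = df.withColumn(
            "distinct_values", size(collect_set(col("value")).over(byGroup)));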

How to maintain a Unique List in Java?

You can use a Set implementation. Some info from the Javadoc:

A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction. Note: Great care must be … Read more
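
By way of illustration (a sketch of my own, not from the answer): when insertion order matters, a LinkedHashSet gives you a duplicate-free collection that still iterates in the order elements were added:

    import java.util.LinkedHashSet;
    import java.util.Set;

    public class UniqueListDemo {
        public static void main(String[] args) {
            // LinkedHashSet rejects duplicates but keeps insertion order.
            Set<String> unique = new LinkedHashSet<>();
            unique.add("a");
            unique.add("b");
            unique.add("a");              // add() returns false; set unchanged
            System.out.println(unique);   // prints [a, b]
        }
    }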

Java 8 Distinct by property

Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:

    public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

… Read more
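
To show how the predicate composes with Stream.filter, here is a self-contained sketch; the Person record and the sample data are hypothetical (and need Java 16+), not part of the original answer:

    import java.util.List;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;
    import java.util.function.Predicate;
    import java.util.stream.Stream;

    public class DistinctByKeyDemo {
        record Person(String name, int age) {}   // hypothetical example type

        // The answer's predicate: remembers keys it has seen across calls.
        static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
            Set<Object> seen = ConcurrentHashMap.newKeySet();
            return t -> seen.add(keyExtractor.apply(t));
        }

        public static void main(String[] args) {
            List<Person> firstByName = Stream.of(
                            new Person("Ann", 30),
                            new Person("Bob", 25),
                            new Person("Ann", 41))
                    .filter(distinctByKey(Person::name))
                    .toList();
            // The second "Ann" is dropped: her key was already seen.
            System.out.println(firstByName);
        }
    }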
