Filter rows which contain a certain string

The answer to the question was already posted by @latemail in the comments above. You can use regular expressions in the second and subsequent arguments of filter, like this: dplyr::filter(df, !grepl("RTB", TrackingPixel)). Since you have not provided the original data, I will add a toy example using the mtcars data set. Imagine you are only …
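As a minimal sketch of the same idea on mtcars (not the toy example from the original answer, which is cut off above), the code below keeps or drops rows by a substring match. The column name `model` and the search string "Merc" are assumptions chosen for illustration.

```r
library(dplyr)

# Move the car names from row names into a regular column.
# `model` is a hypothetical column name chosen for this sketch.
cars <- tibble::rownames_to_column(mtcars, var = "model")

# Keep only the rows whose model name contains "Merc".
mercs <- filter(cars, grepl("Merc", model))

# Negate the match to drop those rows instead.
others <- filter(cars, !grepl("Merc", model))
```

Because grepl() treats the pattern as a regular expression by default, passing fixed = TRUE is worth considering when the search string contains characters such as "." or "(" that should be matched literally.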

Can dplyr join on multiple columns or composite key?

Updating to use tibble(): you can pass a named vector of length greater than 1 to the by argument of left_join(): library(dplyr) d1 <- tibble( x = letters[1:3], y = LETTERS[1:3], a = rnorm(3) ) d2 <- tibble( x2 = letters[3:1], y2 = LETTERS[3:1], b = rnorm(3) ) left_join(d1, d2, by = c("x" = "x2", …
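A self-contained sketch of a composite-key join follows. The data and the second key pair ("y" = "y2") are assumptions made for illustration, since the excerpt above is cut off before the full by argument is shown.

```r
library(dplyr)

# Two small tables whose key columns have different names.
d1 <- tibble(x = c("a", "b", "c"), y = c("A", "B", "C"), a = 1:3)
d2 <- tibble(x2 = c("c", "b", "a"), y2 = c("C", "B", "X"), b = c(30, 20, 10))

# Each name on the left is a column of d1, each value a column of d2.
# A row matches only when BOTH key pairs agree.
joined <- left_join(d1, d2, by = c("x" = "x2", "y" = "y2"))
```

Here the row with x = "a" gets NA for b, because d2 pairs "a" with "X" rather than "A". In dplyr 1.1.0 and later, the same join can also be written with by = join_by(x == x2, y == y2).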

How to select the rows with maximum values in each group with dplyr? [duplicate]

Try this: result <- df %>% group_by(A, B) %>% filter(value == max(value)) %>% arrange(A, B, C). Seems to work: identical( as.data.frame(result), ddply(df, .(A, B), function(x) x[which.max(x$value),]) ) #[1] TRUE. As pointed out in the comments, slice may be preferred here as per @RoyalITS’ answer below if you strictly only want 1 row per group. This answer will …
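To make the difference between the two approaches concrete, here is a small sketch on made-up data (the columns A, value and id are assumptions). It uses slice_max(), which is available in dplyr 1.0.0 and later, as one way to get strictly one row per group.

```r
library(dplyr)

# Group "a" has a tie for the maximum value.
df <- tibble(
  A     = c("a", "a", "a", "b", "b"),
  value = c(1, 3, 3, 2, 5),
  id    = 1:5
)

# filter() keeps every row that equals the group maximum, ties included.
with_ties <- df %>%
  group_by(A) %>%
  filter(value == max(value)) %>%
  ungroup()

# slice_max() with with_ties = FALSE keeps exactly one row per group.
one_per_group <- df %>%
  group_by(A) %>%
  slice_max(value, n = 1, with_ties = FALSE) %>%
  ungroup()
```

With this data, with_ties has three rows (both tied rows of group "a" plus the maximum of group "b"), while one_per_group has two.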

Sum across multiple columns with dplyr

dplyr >= 1.0.0, using across. Sum up each row using rowSums (rowwise works for any aggregation, but is slower): df %>% replace(is.na(.), 0) %>% mutate(sum = rowSums(across(where(is.numeric)))). Sum down each column: df %>% summarise(across(everything(), ~ sum(.x, na.rm = TRUE))). dplyr < 1.0.0, sum up each row: df %>% replace(is.na(.), 0) %>% mutate(sum = rowSums(.[1:5])). Sum down …
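The following sketch shows both directions on a tiny made-up table (the column names are assumptions). Instead of replacing NA with 0 first, it passes na.rm = TRUE to the summing functions, which gives the same totals here.

```r
library(dplyr)

df <- tibble(
  id = c("r1", "r2", "r3"),
  x  = c(1, NA, 3),
  y  = c(10, 20, NA)
)

# Sum across the numeric columns of each row, ignoring missing values.
row_totals <- df %>%
  mutate(total = rowSums(across(where(is.numeric)), na.rm = TRUE))

# Sum down each numeric column, ignoring missing values.
col_totals <- df %>%
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))
```

Restricting across() to where(is.numeric) keeps the character column id out of the sums; everything() would fail on it.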

Group by multiple columns in dplyr, using string vector input

Just so as to write the code in full, here’s an update on Hadley’s answer with the new syntax: library(dplyr) df <- data.frame( asihckhdoydk = sample(LETTERS[1:3], 100, replace = TRUE), a30mvxigxkgh = sample(LETTERS[1:3], 100, replace = TRUE), value = rnorm(100) ) # Columns you want to group by grp_cols <- names(df)[-3] # Convert character vector to list of symbols …
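Since the excerpt stops before the grouping step, here is a separate sketch of two ways to group by a character vector of column names, assuming dplyr 1.0.0 or later. The data and column names are made up; this is not necessarily the code from the original answer.

```r
library(dplyr)

df <- tibble(
  g1    = c("a", "a", "b", "b"),
  g2    = c("x", "y", "x", "x"),
  value = c(1, 2, 3, 4)
)

# Columns to group by, held as strings.
grp_cols <- c("g1", "g2")

# Option 1: select the grouping columns by name with across() and all_of().
by_across <- df %>%
  group_by(across(all_of(grp_cols))) %>%
  summarise(total = sum(value), .groups = "drop")

# Option 2: convert the strings to symbols and splice them in with !!!.
by_syms <- df %>%
  group_by(!!!rlang::syms(grp_cols)) %>%
  summarise(total = sum(value), .groups = "drop")
```

Both options give one row per (g1, g2) combination. The all_of() form also raises an error if any name in grp_cols is missing from the data, which can be useful when the vector comes from user input.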

Remove duplicated rows using dplyr

Here is a solution using dplyr >= 0.5:

library(dplyr)
set.seed(123)
df <- data.frame(
  x = sample(0:1, 10, replace = TRUE),
  y = sample(0:1, 10, replace = TRUE),
  z = 1:10
)

> df %>% distinct(x, y, .keep_all = TRUE)
  x y z
1 0 1 1
2 1 0 2
3 1 1 4

How to interpret dplyr message `summarise()` regrouping output by ‘x’ (override with `.groups` argument)?

It is just a friendly informational message, not an error. By default, if there is any grouping before the summarise, it drops one grouping variable, i.e. the last one specified in the group_by. If there is only one grouping variable, there won’t be any grouping attribute after the summarise, and if there is more than one, i.e. here …
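A short sketch of this behaviour on made-up data (the columns x, y and v are assumptions): the first pipeline relies on the default and keeps the result grouped by x, while the second sets .groups explicitly, which also silences the message.

```r
library(dplyr)

df <- tibble(
  x = c("a", "a", "b", "b"),
  y = c("p", "q", "p", "p"),
  v = 1:4
)

# Default: the last grouping variable (y) is dropped, x is kept,
# and dplyr prints the regrouping message.
default_out <- df %>%
  group_by(x, y) %>%
  summarise(total = sum(v))

# Explicit: drop all grouping; no message is printed.
dropped <- df %>%
  group_by(x, y) %>%
  summarise(total = sum(v), .groups = "drop")

group_vars(default_out)  # grouping variables that remain by default
group_vars(dropped)      # none remain after .groups = "drop"
```

Other accepted values for .groups include "drop_last" (the default behaviour shown first) and "keep", which retains the original grouping.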
