Change value of variable with dplyr
We can use replace() to set the values of mpg to NA in the rows where cyl == 4:
    mtcars %>%
      mutate(mpg = replace(mpg, cyl == 4, NA)) %>%
      as.data.frame()
The answer to the question was already posted by @latemail in the comments above. You can pass a logical condition built with grepl() (which matches a regular expression) as the second or any subsequent argument of filter(), like this:
    dplyr::filter(df, !grepl("RTB", TrackingPixel))
Since you have not provided the original data, I will add a toy example using the mtcars data set. Imagine you are only …
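As a rough illustration of the negated grepl() filter on mtcars, here is a minimal sketch. The column name `model` and the pattern "Merc" are assumptions chosen for this example and are not taken from the original answer.

```r
library(dplyr)

# Toy data: copy the mtcars row names into a regular column.
# The column name `model` is hypothetical, chosen for this sketch.
cars <- mtcars %>%
  mutate(model = rownames(mtcars))

# Keep only the rows whose model name does NOT match the pattern "Merc".
no_merc <- cars %>%
  filter(!grepl("Merc", model))

nrow(no_merc)
```

Dropping the `!` would keep only the matching rows instead.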
Updated to use tibble(). You can pass a named vector of length greater than 1 to the by argument of left_join():
    library(dplyr)
    d1 <- tibble(
      x = letters[1:3],
      y = LETTERS[1:3],
      a = rnorm(3)
    )
    d2 <- tibble(
      x2 = letters[3:1],
      y2 = LETTERS[3:1],
      b = rnorm(3)
    )
    left_join(d1, d2, by = c("x" = "x2", …
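The excerpt above is cut off, so the following is only a sketch of a join on two differently named key pairs. The second pair, "y" = "y2", and the fixed numeric values are assumptions made so the result is reproducible.

```r
library(dplyr)

# Fixed values instead of rnorm() so the result is deterministic.
d1 <- tibble(x = letters[1:3], y = LETTERS[1:3], a = c(1, 2, 3))
d2 <- tibble(x2 = letters[3:1], y2 = LETTERS[3:1], b = c(30, 20, 10))

# Each name is a column in d1, each value the matching column in d2.
# The "y" = "y2" pair is assumed; the original code is truncated here.
joined <- left_join(d1, d2, by = c("x" = "x2", "y" = "y2"))

joined
```

The result keeps the key names from the left table (x and y) and appends b from d2.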
Try this:
    result <- df %>%
      group_by(A, B) %>%
      filter(value == max(value)) %>%
      arrange(A, B, C)
Seems to work:
    identical(
      as.data.frame(result),
      ddply(df, .(A, B), function(x) x[which.max(x$value), ])
    )
    #[1] TRUE
As pointed out in the comments, slice may be preferred here, as per @RoyalITS’ answer below, if you strictly want only 1 row per group. This answer will …
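To make the difference between the two approaches concrete, here is a small sketch on made-up data with a tie in one group. It uses slice_max(), which assumes dplyr >= 1.0.0; the data frame and its values are invented for this example.

```r
library(dplyr)

# Made-up data: group (b, x) has two rows tied for the maximum value.
df <- tibble(
  A = c("a", "a", "a", "b", "b"),
  B = c("x", "x", "y", "x", "x"),
  C = 1:5,
  value = c(1, 3, 2, 5, 5)
)

# filter() keeps every row that ties for the group maximum.
all_max <- df %>%
  group_by(A, B) %>%
  filter(value == max(value)) %>%
  ungroup()

# slice_max() with with_ties = FALSE keeps exactly one row per group.
one_max <- df %>%
  group_by(A, B) %>%
  slice_max(value, n = 1, with_ties = FALSE) %>%
  ungroup()

c(nrow(all_max), nrow(one_max))
```

Which of the two is right depends on whether tied rows should all be reported or collapsed to a single representative.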
dplyr >= 1.0.0, using across
Sum across each row using rowSums (rowwise works for any aggregation, but is slower):
    df %>%
      replace(is.na(.), 0) %>%
      mutate(sum = rowSums(across(where(is.numeric))))
Sum down each column, ignoring NA values:
    df %>%
      summarise(across(everything(), ~ sum(., na.rm = TRUE)))
dplyr < 1.0.0
Sum across each row:
    df %>%
      replace(is.na(.), 0) %>%
      mutate(sum = rowSums(.[1:5]))
Sum down …
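A small worked sketch of both directions on made-up data, assuming dplyr >= 1.0.0. It passes na.rm = TRUE instead of replacing NA with 0 first, which gives the same totals here; the data frame is invented for illustration.

```r
library(dplyr)

# Made-up data with one NA in each column.
df <- tibble(a = c(1, NA, 3), b = c(4, 5, NA))

# Sum across each row: across() selects the numeric columns,
# rowSums() adds them up while skipping NA values.
row_totals <- df %>%
  mutate(sum = rowSums(across(where(is.numeric)), na.rm = TRUE))

# Sum down each column, skipping NA values.
col_totals <- df %>%
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))

row_totals$sum  # 5 5 3
col_totals      # a = 4, b = 9
```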
Just so as to write the code in full, here’s an update on Hadley’s answer with the new syntax:
    library(dplyr)
    df <- data.frame(
      asihckhdoydk = sample(LETTERS[1:3], 100, replace = TRUE),
      a30mvxigxkgh = sample(LETTERS[1:3], 100, replace = TRUE),
      value = rnorm(100)
    )
    # Columns you want to group by
    grp_cols <- names(df)[-3]
    # Convert character vector to list of symbols …
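Since the excerpt stops before the grouping step, here is a hedged sketch of two ways to group by column names held in a character vector: converting them to symbols with rlang::syms() and splicing with !!!, or selecting them with across(all_of()) (which assumes dplyr >= 1.0.0). The small data frame and the column names g1 and g2 are made up for this example and may differ from what the original answer went on to show.

```r
library(dplyr)

# Made-up data; g1 and g2 are hypothetical grouping columns.
df <- tibble(
  g1 = c("A", "A", "B", "B"),
  g2 = c("x", "y", "x", "x"),
  value = c(1, 2, 3, 4)
)
grp_cols <- c("g1", "g2")

# Option 1: convert the character vector to a list of symbols and splice it in.
by_syms <- df %>%
  group_by(!!!rlang::syms(grp_cols)) %>%
  summarise(total = sum(value), .groups = "drop")

# Option 2: select the grouping columns by name with across(all_of()).
by_across <- df %>%
  group_by(across(all_of(grp_cols))) %>%
  summarise(total = sum(value), .groups = "drop")

by_across
```

Both variants should produce the same three groups; the across() form avoids explicit tidy evaluation.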
In dplyr (>= 1.0.0) you may use across(everything(), ...) in summarise to apply a function to all variables:
    library(dplyr)
    df %>%
      group_by(grp) %>%
      summarise(across(everything(), mean))
    #> # A tibble: 3 x 5
    #>     grp     a     b     c     d
    #>   <int> <dbl> <dbl> <dbl> <dbl>
    #> 1     1  3.08  2.98  2.98  2.91
    #> 2     2  3.03  3.04  2.97 …
Here is a solution using dplyr >= 0.5.
    library(dplyr)
    set.seed(123)
    df <- data.frame(
      x = sample(0:1, 10, replace = TRUE),
      y = sample(0:1, 10, replace = TRUE),
      z = 1:10
    )
    > df %>% distinct(x, y, .keep_all = TRUE)
      x y z
    1 0 1 1
    2 1 0 2
    3 1 1 4
There is probably a faster way:
    df %>%
      group_by(id) %>%
      arrange(stopSequence) %>%
      filter(row_number() == 1 | row_number() == n())
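One possible alternative is to pick the first and last row of each group by position with slice(). This is only a sketch on made-up data, not a benchmarked improvement; unique() is there so that a single-row group is not returned twice.

```r
library(dplyr)

# Made-up data: id 2 has only one row.
df <- tibble(
  id = c(1, 1, 1, 2, 3, 3),
  stopSequence = c(3, 1, 2, 1, 2, 1)
)

first_last <- df %>%
  group_by(id) %>%
  arrange(stopSequence, .by_group = TRUE) %>%  # sort within each id
  slice(unique(c(1, n()))) %>%                 # first and last row per group
  ungroup()

first_last
```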
It is just a friendly warning message. By default, if there is any grouping before the summarise, it drops one grouping variable, i.e. the last one specified in the group_by. If there is only one grouping variable, there won’t be any grouping attribute after the summarise, and if there is more than one, i.e. here …
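The behaviour described above can be checked with group_vars(). The sketch below uses mtcars grouped by cyl and gear as an assumed example and relies on the .groups argument of summarise(), which is available in dplyr >= 1.0.0.

```r
library(dplyr)

by_two <- mtcars %>% group_by(cyl, gear)

# Default: the last grouping variable (gear) is dropped, cyl remains.
# suppressMessages() only hides the informational message.
default_res <- suppressMessages(
  by_two %>% summarise(mean_mpg = mean(mpg))
)
group_vars(default_res)  # "cyl"

# .groups = "drop" removes all grouping from the result.
dropped <- by_two %>% summarise(mean_mpg = mean(mpg), .groups = "drop")
group_vars(dropped)      # character(0)

# .groups = "keep" preserves both grouping variables.
kept <- by_two %>% summarise(mean_mpg = mean(mpg), .groups = "keep")
group_vars(kept)         # "cyl" "gear"
```

Setting .groups explicitly also silences the message, since the grouping of the result is no longer left to the default.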