How to convert a factor to integer\numeric without loss of information?

See the Warning section of ?factor: In particular, as.numeric applied to a factor is meaningless, and may happen by implicit coercion. To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)). The R FAQ has similar advice. Why is as.numeric(levels(f))[f] more efficient than as.numeric(as.character(f))? …
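A minimal sketch of the two conversions named above, using a small made-up factor (the data and variable names are assumptions for illustration, not from the original answer). It also shows why plain as.numeric(f) is misleading: it returns the internal integer codes rather than the labels.

```r
# Hypothetical example data: a factor whose labels are numbers stored as text.
f <- factor(c("10", "3.5", "10", "7"))

# as.numeric(f) returns the internal integer codes, not the labels.
codes <- as.numeric(f)

# Convert the levels once, then index the result by the factor's codes.
via_levels <- as.numeric(levels(f))[f]

# Convert every element to character first, then to numeric.
via_character <- as.numeric(as.character(f))

print(codes)
print(via_levels)
print(via_character)
```

The levels-based form converts only the distinct levels to numbers, while the character-based form converts every element, which is one plausible reading of why the help page calls the former slightly more efficient.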

Drop data frame columns by name

You can use a simple character vector of names:

    DF <- data.frame(x = 1:10, y = 10:1, z = rep(5, 10), a = 11:20)
    drops <- c("x", "z")
    DF[ , !(names(DF) %in% drops)]

Or, alternatively, you can make a vector of those to keep and refer to them by name:

    keeps <- c("y", "a")
    DF[keeps]

EDIT: For those still not acquainted …
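The two approaches above can be run side by side as a quick check. This is only a sketch: the result names `dropped` and `kept` are introduced here for illustration, and the `drop = FALSE` argument is an addition not shown in the excerpt, included so the result stays a data frame even if only one column remains.

```r
DF <- data.frame(x = 1:10, y = 10:1, z = rep(5, 10), a = 11:20)

# Approach 1: drop the named columns with a logical mask.
drops <- c("x", "z")
dropped <- DF[, !(names(DF) %in% drops), drop = FALSE]  # drop = FALSE is an added safeguard

# Approach 2: keep the named columns by indexing with their names.
keeps <- c("y", "a")
kept <- DF[keeps]

print(names(dropped))
print(names(kept))
```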

Remove rows with all or some NAs (missing values) in data.frame

Also check complete.cases:

    > final[complete.cases(final), ]
                 gene hsap mmul mmus rnor cfam
    2 ENSG00000199674    0    2    2    2    2
    6 ENSG00000221312    0    1    2    3    2

na.omit is nicer for just removing all NAs. complete.cases allows partial selection by including only certain columns of the data frame:

    > final[complete.cases(final[ , 5:6]), ]
    gene hsap mmul mmus …
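As a self-contained illustration of the same idea, here is a sketch on a small invented data frame. The data frame `toy` and its values are hypothetical and are not the `final` data from the original question.

```r
# Hypothetical data, not the data frame from the original question.
toy <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  hsap = c(0, 0, NA, 1),
  mmul = c(NA, 2, 1, 2),
  rnor = c(2, 2, NA, NA),
  stringsAsFactors = FALSE
)

# Keep only rows with no NA in any column.
all_complete <- toy[complete.cases(toy), ]

# na.omit gives the same rows for the "no NA anywhere" case.
omitted <- na.omit(toy)

# Keep rows that are complete in the chosen columns only.
partial <- toy[complete.cases(toy[, c("hsap", "mmul")]), ]

print(all_complete)
print(partial)
```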

Grouping functions (tapply, by, aggregate) and the *apply family

R has many *apply functions which are ably described in the help files (e.g. ?apply). There are enough of them, though, that beginning useRs may have difficulty deciding which one is appropriate for their situation or even remembering them all. They may have a general sense that “I should be using an *apply function here”, …
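To make the distinction concrete, here is a brief sketch of four commonly used members of the family on tiny made-up inputs. The data and variable names are assumptions for illustration; the excerpt itself does not show these examples.

```r
# apply: run a function over the rows (1) or columns (2) of a matrix.
m <- matrix(1:6, nrow = 2)            # rows are (1, 3, 5) and (2, 4, 6)
row_sums <- apply(m, 1, sum)

# lapply: run a function over each element and always return a list.
squares_list <- lapply(1:3, function(i) i^2)

# sapply: like lapply, but simplifies the result to a vector when it can.
squares_vec <- sapply(1:3, function(i) i^2)

# tapply: run a function over subsets of a vector defined by a grouping.
values <- c(1, 2, 3, 4)
groups <- c("a", "b", "a", "b")
group_means <- tapply(values, groups, mean)

print(row_sums)
print(squares_vec)
print(group_means)
```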

How to join (merge) data frames (inner, outer, left, right)

By using the merge function and its optional parameters: Inner join: merge(df1, df2) will work for these examples because R automatically joins the frames by common variable names, but you would most likely want to specify merge(df1, df2, by = "CustomerId") to make sure that you were matching on only the fields you desired. You …
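A short sketch of the four join types named in the title, using base R's merge with its `all`, `all.x`, and `all.y` arguments. The two data frames below are invented for illustration and may differ from the ones in the original question.

```r
# Hypothetical example data sharing a CustomerId key.
df1 <- data.frame(CustomerId = 1:4,
                  Product = c("Toaster", "Radio", "Toaster", "Radio"))
df2 <- data.frame(CustomerId = c(2L, 4L, 6L),
                  State = c("Alabama", "Ohio", "Texas"))

# Inner join: only CustomerIds present in both frames.
inner <- merge(df1, df2, by = "CustomerId")

# Outer join: every CustomerId from either frame, NA where unmatched.
outer <- merge(df1, df2, by = "CustomerId", all = TRUE)

# Left join: every row of df1, matched to df2 where possible.
left <- merge(df1, df2, by = "CustomerId", all.x = TRUE)

# Right join: every row of df2, matched to df1 where possible.
right <- merge(df1, df2, by = "CustomerId", all.y = TRUE)

print(inner)
print(outer)
```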

How to make a great R reproducible example

Basically, a minimal reproducible example (MRE) should enable others to exactly reproduce your issue on their machines. Please do not post images of your data, code, or console output! tl;dr An MRE consists of the following items: a minimal dataset, necessary to demonstrate the problem; the minimal runnable code necessary to reproduce the issue, which …
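One common way to share a minimal dataset as text rather than as an image is base R's dput, sketched below. The use of set.seed and dput here is an assumption about what a suitable MRE might contain; the excerpt is cut off before it names specific tools, and the data frame is invented.

```r
# Fix the random seed so others generate the same numbers.
set.seed(42)

# Hypothetical minimal dataset.
df <- data.frame(id = 1:5, value = round(rnorm(5), 2))

# dput prints R code that recreates the object; paste that text into a question.
dput(df)

# Round trip: evaluating the printed text rebuilds an equivalent data frame.
dput_text <- paste(capture.output(dput(df)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
```

Because the output of dput is plain R code, a reader can copy it straight into a session instead of retyping values from a screenshot.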