My code (following the description of your first method) runs normally in Spark 1.4.0-SNAPSHOT
on these two configurations:
Intellij IDEA's test
Spark Standalone cluster
with 8 nodes (1 master, 7 worker)
Please check if any differences exists
val bc = sc.broadcast(Array[String]("login3", "login4"))
val x = Array(("login1", 192), ("login2", 146), ("login3", 72))
val xdf = sqlContext.createDataFrame(x).toDF("name", "cnt")
val func: (String => Boolean) = (arg: String) => bc.value.contains(arg)
val sqlfunc = udf(func)
val filtered = xdf.filter(sqlfunc(col("name")))
xdf.show()
filtered.show()
Output
name cnt
login1 192
login2 146
login3 72name cnt
login3 72