Spark Error: expected zero arguments for construction of ClassDict (for numpy.core.multiarray._reconstruct)

The source of the problem is that the object returned from the UDF doesn’t conform to the declared type. np.unique not only returns a numpy.ndarray but also converts numerics to the corresponding NumPy types, which are not compatible with the DataFrame API. You can try something like this: udf(lambda x: list(set(x)), ArrayType(IntegerType())) or this (to keep order): udf(lambda …
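As a rough illustration of the type mismatch described above, the plain-Python sketch below deduplicates a sequence while keeping first-seen order and coerces every element to a built-in int, so nothing NumPy-typed would reach Spark. The function name unique_ints is hypothetical and this is not the truncated snippet from the answer; it only assumes the input elements are hashable and convertible with int().

```python
def unique_ints(values):
    """Return the distinct values as built-in ints, in first-seen order.

    Hypothetical helper: int(v) turns NumPy integer scalars (or any other
    int-like value) into plain Python ints, which is what a UDF declared
    as ArrayType(IntegerType()) is expected to return.
    """
    if values is None:
        # Pass nulls through unchanged instead of failing on iteration.
        return None
    # dict.fromkeys keeps insertion order (Python 3.7+), so duplicates are
    # dropped without reordering the remaining elements.
    return [int(v) for v in dict.fromkeys(values)]


result = unique_ints([3, 1, 3, 2, 1])
```

In PySpark, a function like this could presumably be wrapped the same way as the lambda above, for example udf(unique_ints, ArrayType(IntegerType())).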

Pass table as parameter into SQL Server UDF

You can, but not just any table. From the documentation: For Transact-SQL functions, all data types, including CLR user-defined types and user-defined table types, are allowed except the timestamp data type. You can use user-defined table types. Example of a user-defined table type: CREATE TYPE TableType AS TABLE (LocationName VARCHAR(50)) GO DECLARE @myTable TableType INSERT INTO @myTable(LocationName) VALUES('aaa') …

SQL Server 2008 – How do I return a User-Defined Table Type from a Table-Valued Function?

Even though you cannot return the UDTT from a function, you can return a table variable and receive it in a UDTT as long as the schemas match. The following code is tested in SQL Server 2008 R2:
-- Create the UDTT
CREATE TYPE dbo.MyCustomUDDT AS TABLE ( FieldOne varchar(512), FieldTwo varchar(1024) ) …

Apache Spark — Assign the result of UDF to multiple dataframe columns

It is not possible to create multiple top-level columns from a single UDF call, but you can create a new struct. It requires a UDF with a specified returnType: from pyspark.sql.functions import udf from pyspark.sql.types import StructType, StructField, FloatType schema = StructType([ StructField("foo", FloatType(), False), StructField("bar", FloatType(), False) ]) def udf_test(n): return (n / 2, …
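The struct approach above might look roughly like the following sketch. The names halve_and_double, build_struct_udf, half and double are hypothetical and are not taken from the truncated answer; the PySpark imports are kept inside the builder so the plain function can be tried without a Spark installation. It assumes the input column holds non-null numeric values.

```python
def halve_and_double(n):
    """Hypothetical per-row logic: return two floats as one tuple.

    Spark maps the tuple positionally onto the fields of the declared
    struct type, so the order here must match the schema below.
    """
    return (n / 2.0, n * 2.0)


def build_struct_udf():
    """Wrap halve_and_double as a UDF returning a two-field struct.

    Imports are local on purpose (an assumption of this sketch), so the
    module loads even where PySpark is not installed.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StructType, StructField, FloatType

    schema = StructType([
        StructField("half", FloatType(), False),
        StructField("double", FloatType(), False),
    ])
    return udf(halve_and_double, schema)


sample = halve_and_double(3)
```

With a DataFrame df that has a numeric column n, one possible usage is df.withColumn("s", build_struct_udf()(df["n"])).select("s.half", "s.double"), which expands the struct into two top-level columns after the single UDF call.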

Execute Stored Procedure from a Function

EDIT: I haven’t tried this, so I can’t vouch for it! And you already know you shouldn’t be doing this, so please don’t do it. BUT… Try looking here: http://sqlblog.com/blogs/denis_gobo/archive/2008/05/08/6703.aspx The key bit is this bit, which I have attempted to tweak for your purposes: DECLARE @SQL varchar(500) SELECT @SQL = 'osql -S' + @@servername + ' …

Spark functions vs UDF performance?

“When would a UDF be faster?” If you are asking about Python UDFs, the answer is probably never*. Since SQL functions are relatively simple and are not designed for complex tasks, it is pretty much impossible to compensate for the cost of repeated serialization, deserialization and data movement between the Python interpreter and the JVM. Does anyone know why this …
