PySpark: modify column values when another column value satisfies a condition
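You can use when() and otherwise(). when(condition, value) keeps the original Id where Rank <= 5, and otherwise('other') supplies the value for every remaining row:

    from pyspark.sql.functions import col, when

    df \
        .withColumn('Id_New', when(df.Rank <= 5, df.Id).otherwise('other')) \
        .drop(df.Id) \
        .select(col('Id_New').alias('Id'), col('Rank')) \
        .show()

This gives the following output:

    +-----+----+
    |   Id|Rank|
    +-----+----+
    |    a|   5|
    |other|   7|
    |other|   8|
    |    d|   1|
    +-----+----+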