  1. Comparison operator in PySpark (not equal/ !=) - Stack Overflow

    Aug 24, 2016 · Comparison operator in PySpark (not equal / !=)

  2. PySpark: "Exception: Java gateway process exited before sending the ...

    I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error: Exception: Java gateway process exited before sending the driver its port number when sc = SparkContext() is …
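A common cause of this error (an assumption, not the only possibility) is that PySpark cannot locate a Java runtime. A hedged sketch of the usual fix is to point `JAVA_HOME` at an installed JDK before creating the SparkContext; the macOS helper below is an example path, not universal.

```shell
# macOS: /usr/libexec/java_home prints the path of an installed JDK.
# On Linux, set JAVA_HOME to your JDK directory instead.
export JAVA_HOME="$(/usr/libexec/java_home)"
export PATH="$JAVA_HOME/bin:$PATH"

# Verify that the gateway now starts.
python -c "from pyspark import SparkContext; sc = SparkContext('local[1]'); print(sc.version)"
```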

  3. pyspark - How to use AND or OR condition in when in Spark - Stack Overflow

    pyspark.sql.functions.when takes a Boolean Column as its condition. When using PySpark, it's often useful to think "Column Expression" when you read "Column". Logical operations on PySpark …

  4. How do I add a new column to a Spark DataFrame (using PySpark)?

    Performance-wise, built-in functions (pyspark.sql.functions), which map to Catalyst expression, are usually preferred over Python user defined functions. If you want to add content of an arbitrary RDD …

  5. pyspark : NameError: name 'spark' is not defined

    Alternatively, you can use the pyspark shell where spark (the Spark session) as well as sc (the Spark context) are predefined (see also NameError: name 'spark' is not defined, how to solve?).

  6. pyspark - Adding a dataframe to an existing delta table throws DELTA ...

    Jun 9, 2024 · Fix Issue was due to mismatched data types. Explicitly declaring schema type resolved the issue. schema = StructType([ StructField("_id", StringType(), True), StructField("

  7. How to change dataframe column names in PySpark?

    I come from pandas background and am used to reading data from CSV files into a dataframe and then simply changing the column names to something useful using the simple command: df.columns =

  8. Show distinct column values in pyspark dataframe - Stack Overflow

    With pyspark dataframe, how do you do the equivalent of Pandas df['col'].unique(). I want to list out all the unique values in a pyspark dataframe column. Not the SQL way (registerTempTable then…

  9. How to find count of Null and Nan values for each column in a PySpark ...

    Jun 19, 2017 · How to find count of Null and Nan values for each column in a PySpark dataframe efficiently?

  10. Newest 'pyspark' Questions - Stack Overflow

    Mar 24, 2026 · Stack Overflow | The World’s Largest Online Community for Developers