  1. python - PySpark: "Exception: Java gateway process exited before ...

    I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error: Exception: Java gateway process exited before sending the driver its port number when sc = …

  2. python - Spark Equivalent of IF Then ELSE - Stack Overflow

  3. pyspark - How to use AND or OR condition in when in Spark

    pyspark.sql.functions.when takes a Boolean Column as its condition. When using PySpark, it's often useful to think "Column Expression" when you read "Column". Logical operations on …

  4. pyspark : NameError: name 'spark' is not defined

    Alternatively, you can use the pyspark shell where spark (the Spark session) as well as sc (the Spark context) are predefined (see also NameError: name 'spark' is not defined, how to solve?).

  5. python - Concatenate two PySpark dataframes - Stack Overflow

    Use the unionByName method in PySpark, which concatenates two DataFrames along axis 0, as the pandas concat method does. Now suppose you have df1 with columns id, uniform, normal and …

  6. Pyspark: get list of files/directories on HDFS path

    Mar 2, 2016 · As per title. I'm aware of textFile but, as the name suggests, it works only on text files. I would need to access files/directories inside a path on either HDFS or a local path. I'm …

  7. Comparison operator in PySpark (not equal/ !=) - Stack Overflow

    Aug 24, 2016 · Comparison operator in PySpark (not equal/ !=)

  8. How do I replace a string value with a NULL in PySpark?

    Mar 7, 2023 · I want to do something like this: df.replace('empty-value', None, 'NAME') Basically, I want to replace some value with NULL, but it does not accept None as an argument. How can …

  9. Pyspark: Replacing value in a column by searching a dictionary

    May 15, 2017 · @AliAzG is there a way to remove the rows of a PySpark DataFrame whose values in a given column are not among a dictionary's keys?

  10. Pyspark: Parse a column of json strings - Stack Overflow

    I have a pyspark dataframe consisting of one column, called json, where each row is a unicode string of json. I'd like to parse each row and return a new dataframe where each row is the …