From the course: Complete Guide to Databricks for Data Engineering

Handle date manipulation in PySpark

- Let's use our existing data frame and see whether it has any date column in it. So let me run df.printSchema(). You'll find that we have one specific column, registration_date, which is of type date. When you have a column of date type, you can do a lot of things with it. There are many date functions, so rather than importing each of them explicitly again and again, I'm just using pyspark.sql.functions and importing everything. So I will say import *. Now let's say, for example, I want to get the year, month, and day from that registration date. How can I do that? I will say data frame one is equal to df.select, pass "registration_date", and then I can get the year out of it using the year function, so I'll say year of registration_date. Similarly, if I want the month, I can just say month and pass registration_date. And if I want the day of the month, I would say dayofmonth, like this, and let's just display this. Now…
