
This project was developed under the guidance of Alex Freberg. In it, we clean raw housing data using SQL Server to make it easier to analyze.
Download the dataset through this link.
We standardize date-time information using the CONVERT function.
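As a rough illustration of this step, the sketch below strips the time portion from a datetime value with CONVERT. The column name SaleDate and the two sample rows are assumptions made up for the example; they are not taken from the dataset.

```sql
-- Minimal sketch: convert a datetime to a date-only value.
-- SaleDate and the sample rows are hypothetical.
SELECT
    h.SaleDate,
    CONVERT(date, h.SaleDate) AS SaleDateConverted  -- drops the time portion
FROM (VALUES
    (CAST('2013-04-09T00:00:00' AS datetime)),
    (CAST('2014-06-10T00:00:00' AS datetime))
) AS h (SaleDate);
```

To persist the result in a real table, one option is to add a new date column with ALTER TABLE and fill it with an UPDATE that uses the same CONVERT expression.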
Additionally, we populate missing property addresses based on their parcel IDs.
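One possible way to do this is a self-join: rows that share a parcel ID but have different row identifiers can lend their address to the row where it is missing. The temp table #Housing, its columns (UniqueID, ParcelID, PropertyAddress), and the sample rows are all assumptions for illustration.

```sql
-- Hypothetical table and made-up rows, for illustration only.
CREATE TABLE #Housing (
    UniqueID        int PRIMARY KEY,
    ParcelID        varchar(20),
    PropertyAddress varchar(100) NULL
);

INSERT INTO #Housing (UniqueID, ParcelID, PropertyAddress) VALUES
    (1, 'P-001', '123 MAIN ST, SPRINGFIELD'),
    (2, 'P-001', NULL),                        -- same parcel, address missing
    (3, 'P-002', '9 OAK AVE, RIVERTON');

-- Copy the address from another row with the same parcel ID.
UPDATE a
SET a.PropertyAddress = b.PropertyAddress
FROM #Housing AS a
JOIN #Housing AS b
    ON a.ParcelID = b.ParcelID
   AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL
  AND b.PropertyAddress IS NOT NULL;

SELECT UniqueID, ParcelID, PropertyAddress
FROM #Housing
ORDER BY UniqueID;

DROP TABLE #Housing;
```

If several rows with the same parcel ID carry different addresses, this UPDATE picks one of them arbitrarily, so it is worth checking for such conflicts first.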
By splitting addresses into their components (such as city and state) using SUBSTRING and PARSENAME, we simplify subsequent analysis.
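The sketch below shows both approaches on a single made-up row. It assumes the address parts are separated by commas, and the column names PropertyAddress and OwnerAddress are hypothetical. SUBSTRING is paired with CHARINDEX to locate the comma. PARSENAME splits on periods and numbers the parts from the right, so the commas are replaced with periods first.

```sql
-- Made-up sample row; column names are assumptions.
SELECT
    -- Text before the first comma: the street part.
    SUBSTRING(h.PropertyAddress, 1,
              CHARINDEX(',', h.PropertyAddress) - 1)        AS PropertyStreet,
    -- Text after the first comma: the city part.
    LTRIM(SUBSTRING(h.PropertyAddress,
                    CHARINDEX(',', h.PropertyAddress) + 1,
                    LEN(h.PropertyAddress)))                AS PropertyCity,
    -- PARSENAME counts parts from the right: 3 = street, 2 = city, 1 = state.
    PARSENAME(REPLACE(h.OwnerAddress, ',', '.'), 3)         AS OwnerStreet,
    LTRIM(PARSENAME(REPLACE(h.OwnerAddress, ',', '.'), 2))  AS OwnerCity,
    LTRIM(PARSENAME(REPLACE(h.OwnerAddress, ',', '.'), 1))  AS OwnerState
FROM (VALUES
    ('123 MAIN ST, SPRINGFIELD', '123 MAIN ST, SPRINGFIELD, TN')
) AS h (PropertyAddress, OwnerAddress);
```

Two caveats apply. The SUBSTRING expression raises an error when an address has no comma, and PARSENAME returns NULL when the string has more than four parts, so addresses that already contain periods would need extra handling.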
Furthermore, we transform ‘Y’ and ‘N’ values in the ‘Sold As Vacant’ field to ‘Yes’ and ‘No’ using a CASE statement.
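A minimal sketch of such a CASE expression follows. The column name SoldAsVacant and the sample values are assumptions. Values that are already spelled out pass through unchanged.

```sql
-- Made-up sample values; the column name is an assumption.
SELECT
    h.SoldAsVacant,
    CASE h.SoldAsVacant
        WHEN 'Y' THEN 'Yes'
        WHEN 'N' THEN 'No'
        ELSE h.SoldAsVacant      -- leave 'Yes' and 'No' as they are
    END AS SoldAsVacantClean
FROM (VALUES ('Y'), ('N'), ('Yes'), ('No')) AS h (SoldAsVacant);
```

The same CASE expression can be used in the SET clause of an UPDATE to rewrite the column in place.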
Finally, we remove duplicates using a CTE and PARTITION BY, and eliminate unused columns from the dataset.
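The sketch below shows one common way to do both steps. The text names only a CTE and PARTITION BY; using ROW_NUMBER() as the window function is an assumption. So are the temp table, the columns chosen to define a duplicate, the unused column TaxDistrict, and the sample rows.

```sql
-- Hypothetical table and made-up rows, for illustration only.
CREATE TABLE #Housing (
    UniqueID    int PRIMARY KEY,
    ParcelID    varchar(20),
    SaleDate    date,
    SalePrice   int,
    TaxDistrict varchar(50)      -- stands in for an unused column
);

INSERT INTO #Housing (UniqueID, ParcelID, SaleDate, SalePrice, TaxDistrict) VALUES
    (1, 'P-001', '2013-04-09', 240000, 'DISTRICT A'),
    (2, 'P-001', '2013-04-09', 240000, 'DISTRICT A'),  -- duplicate of row 1
    (3, 'P-002', '2014-06-10', 366000, 'DISTRICT B');

-- Number the rows within each group of identical values;
-- every row numbered above 1 is a duplicate.
WITH RowNumCTE AS (
    SELECT
        ROW_NUMBER() OVER (
            PARTITION BY ParcelID, SaleDate, SalePrice
            ORDER BY UniqueID
        ) AS RowNum
    FROM #Housing
)
DELETE FROM RowNumCTE
WHERE RowNum > 1;

-- Drop a column that is no longer needed.
ALTER TABLE #Housing DROP COLUMN TaxDistrict;

SELECT UniqueID, ParcelID, SaleDate, SalePrice
FROM #Housing
ORDER BY UniqueID;

DROP TABLE #Housing;
```

Deleting through the CTE removes the matching rows from the underlying table. Running the CTE with a SELECT first is a cheap way to review what would be deleted.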
