Valytics Real Estate Data Scrubber/Filter

Situation

Abacus was using Excel macros to scrub and filter large real estate appraisal files (XML). The Valytics data file contains 10,000+ appraisal records and had to be filtered using very complex rules. Data from county record files also had to be merged into the Valytics file and reformatted according to complex scrubbing rules. The macros were difficult to use, took over five hours to run, could not perform the complex manipulations required, and could not handle the extreme volume of data (they crashed often).

Action(s)

I created a C# WinForms application to perform the complex data transformations. The application includes hooks for future expansion and a series of checkboxes for selecting the various filters. It merges the data from the Snohomish County files (over 350,000 records), and all operations complete in less than 40 seconds. After the success of the first phase, King County filtering and scrubbing was added, which merges data from six separate files with roughly 1,000,000 records. The customer plans to add Pierce County scrubbing and filtering in the future.

Outcome

This application saves many hours, performs the complex data manipulations, and uploads the data to the online appraisal database. Because the county data changes very often, it is advantageous to scrub and filter it regularly, and the speed and versatility of the application make that practical, resulting in more accurate appraisal data. The application was expanded to accommodate additional counties. The data format varies by county, but the application applies consistent rules and produces consistent output, which increases the accuracy of the data.

Notes

This is a relatively simple project compared with the other projects in this list. This quick-turnaround project demonstrates my ability to take a task from zero to completion in a short amount of time. The customer was not sure what she wanted or how to accomplish the task. I guided her through the process, provided quick prototypes, and demonstrated how custom software could accomplish these tasks. When the initial project was about ¾ complete, more complex data scrubbing requirements were added. The final push to completion presented some problems because the client did not know about the schema rules built into the Valytics XML data, and schema formatting was not in the programming requirements when I started the project. I then had to add new filters to format many of the other fields to the schema before we could successfully upload the filtered data.

Product Details

  • C#.NET programming language
  • Visual Studio