Data Preprocessing For Unstructured Data Sources