A known issue. I faced this personally a long time ago (had to let it go back then since the use case was no longer valid), but will this help?
https://github.com/freakynit/smart-date-parser
This does maintain context based on past successful parses.
Disclaimer: this is fully Opus-generated, but it does have test cases (in usage.js... I know, that's not what it's for, but it is what it is).
Thanks for sharing this! The context-based approach is interesting. Maintaining state from past successful parses to resolve ambiguity is essentially what we do too, though we batch it (scan first 50 values) rather than doing it incrementally. The incremental approach has the advantage of adapting mid-column if formats shift, which we've seen in datasets that were manually concatenated from different sources.
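To make the batch vs. incremental difference concrete, here is a minimal sketch. It is not our actual code and not the repo's: the function names are hypothetical, it assumes slash-separated values with a four-digit year last, and the incremental version simply trusts the most recent unambiguous value, which is only one of several ways to keep state.

```python
from collections import Counter
from datetime import date
from itertools import islice
from typing import Iterable, Iterator, Optional


def vote(value: str) -> Optional[str]:
    """Return "dmy" or "mdy" when only one reading is valid, else None."""
    try:
        a, b, _year = (int(p) for p in value.strip().split("/"))
    except ValueError:  # wrong number of parts or non-numeric text
        return None
    if a > 12 and 1 <= b <= 12:
        return "dmy"
    if b > 12 and 1 <= a <= 12:
        return "mdy"
    return None  # both readings possible (or neither)


def parse(value: str, order: str) -> date:
    """Parse one value under the given order (hypothetical helper)."""
    a, b, year = (int(p) for p in value.strip().split("/"))
    day, month = (a, b) if order == "dmy" else (b, a)
    return date(year, month, day)  # raises ValueError on an impossible date


def infer_batch(values: Iterable[str], sample_size: int = 50,
                default: str = "mdy") -> str:
    """Batch: pick one order for the whole column from the first N values."""
    votes = Counter(v for v in map(vote, islice(values, sample_size)) if v)
    return votes.most_common(1)[0][0] if votes else default


def parse_incremental(values: Iterable[str],
                      default: str = "mdy") -> Iterator[date]:
    """Incremental: each unambiguous value updates the order used after it."""
    order = default
    for value in values:
        order = vote(value) or order
        yield parse(value, order)


# A column that switches from day-first to month-first partway through.
column = ["03/04/2024", "25/12/2024", "01/02/2024",
          "14/02/2024", "02/13/2024", "05/06/2024"]
print(infer_batch(column))
print([d.isoformat() for d in parse_incremental(column)])
```

On this column the batch pass settles on day-first for everything, so the month-first tail would be misread or rejected, while the incremental pass switches once it sees `02/13/2024`. The trade-off is that the incremental pass reads the very first row with the default, before it has any evidence.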
Will take a look at the repo.
All thanks to Opus :)
Localization for the UI only. Imported and exported data only in standard formats.
That's the ideal, but unfortunately not always an option when you're on the receiving end. We're building a data cleaning tool, so the whole point is dealing with messy user-uploaded CSVs where we don't control the export format. If we could mandate ISO 8601 everywhere, life would be much simpler. But the reality is people copy-paste from Excel, export from legacy systems, or hand-edit CSVs, and we need to handle what shows up.