fix: Apply PREFER_DATES_FROM logic to custom date formats (Fixes #445) #1342
adnan-awan wants to merge 3 commits into
Fixes #445 - The settings.PREFER_DATES_FROM setting now correctly applies when date_formats are explicitly specified with 2-digit year formats (%y).

Previously, when date_formats were provided, the parser would bypass the PREFER_DATES_FROM logic and directly use strptime(), causing ambiguous 2-digit years to be interpreted in the wrong century.

Changes:
- Enhanced parse_with_formats() to detect 2-digit year formats (%y)
- Apply year adjustment logic based on PREFER_DATES_FROM setting
- Added comprehensive test coverage for date_formats + PREFER_DATES_FROM

Test results:
- All 24,056 existing tests pass
- Added 5 new test cases covering various scenarios
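To make the ambiguity concrete, here is a minimal standard-library sketch. It relies only on Python's documented strptime() convention for %y (values 69-99 map to 1969-1999, values 0-68 map to 2000-2068); the sample date strings are invented for illustration and do not come from the PR.

```python
from datetime import datetime

# strptime() resolves %y with a fixed pivot, independent of any
# dateparser setting: 00-68 -> 2000-2068, 69-99 -> 1969-1999.
future_looking = datetime.strptime("15/06/68", "%d/%m/%y")
past_looking = datetime.strptime("15/06/69", "%d/%m/%y")

print(future_looking.year)  # 2068
print(past_looking.year)    # 1969
```

A caller who expects "68" to mean 1968 gets 2068 unless something adjusts the century afterwards, which is the gap this PR addresses.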
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@            Coverage Diff            @@
##           master   #1342     +/-   ##
==========================================
+ Coverage   97.11%   97.12%   +0.01%
==========================================
  Files         235      235
  Lines        2909     2924     +15
==========================================
+ Hits         2825     2840     +15
  Misses         84       84
    else:
        # Apply PREFER_DATES_FROM logic for 2-digit year formats (%y)
        if "%y" in date_format and "%Y" not in date_format:
            now = datetime.today()
Sonnet says we are not supposed to be using datetime.today() here:
The NLP path in parser.py:434-436 computes self.now from settings.RELATIVE_BASE and uses that for the comparison. The PR uses datetime.today() unconditionally, so RELATIVE_BASE is silently ignored in the date_formats path. That inconsistency would surprise users who set a custom base for testing or time-shifted parsing.
Could you check?
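A minimal sketch of the behaviour being asked for, under stated assumptions: `reference_now` is a hypothetical helper, and its `relative_base` argument stands in for the value of settings.RELATIVE_BASE. It is not the code from parser.py.

```python
from datetime import datetime
from typing import Optional


def reference_now(relative_base: Optional[datetime]) -> datetime:
    """Return the configured base if one is set, else the current time.

    Hypothetical helper: `relative_base` plays the role of
    settings.RELATIVE_BASE in this sketch.
    """
    return relative_base if relative_base is not None else datetime.now()


pinned = datetime(2026, 1, 1)
parsed = datetime(2068, 6, 15)

# With a pinned base, the "is this date in the future?" check is
# deterministic instead of depending on the wall clock.
is_future = reference_now(pinned) < parsed
print(is_future)  # True
```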
    if "%y" in date_format and "%Y" not in date_format:
        now = datetime.today()
        if now < date_obj:
            if "past" in settings.PREFER_DATES_FROM:
I see it done somewhere else in the code base, but I wonder why we use 'in' to compare here. PREFER_DATES_FROM is a string, not a set, so maybe == is the right call?
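For reference, a short sketch of why the two operators differ on strings. The value "not_in_the_past" is invented purely to show the difference; it is not a dateparser option.

```python
# On a str, `in` tests substring containment, not equality.
setting = "past"
print("past" in setting)   # True
print(setting == "past")   # True

# A made-up, invalid value still passes the substring test.
bogus = "not_in_the_past"
substring_match = "past" in bogus
exact_match = bogus == "past"
print(substring_match, exact_match)  # True False
```

For the valid setting values the two checks agree, so the difference only shows up with unexpected input, but == states the intent more directly.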
…closes #445)

When date_formats is provided, parse_with_formats() bypassed the PREFER_DATES_FROM setting entirely, always returning future-century dates for 2-digit year formats (%y).

Changes:
- Apply PREFER_DATES_FROM logic in parse_with_formats() for 2-digit year formats: subtract 100 years when preference is 'past' and the parsed date is in the future; add 100 years when preference is 'future' and the parsed date is in the past
- Respect settings.RELATIVE_BASE as the reference point (matching the NLP path in parser.py), falling back to datetime.now(tz=utc) when unset
- Use == instead of 'in' for PREFER_DATES_FROM string comparisons
- Use datetime.now(tz=timezone.utc) consistently, replacing datetime.today()

Tests:
- Add 5 parameterised cases covering past/future preference with 2-digit years, and 4-digit year isolation
- Add dedicated test asserting RELATIVE_BASE is honoured
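The shift rule described in this commit message can be sketched roughly as follows. `shift_century` is a hypothetical name, `prefer` stands in for settings.PREFER_DATES_FROM, and `now` for the reference point; this is an illustration of the rule, not the PR's actual code. It deliberately ignores Feb 29, for which replace() can raise ValueError.

```python
from datetime import datetime


def shift_century(parsed: datetime, now: datetime, prefer: str) -> datetime:
    """Move a date parsed from a 2-digit year by one century if the
    preference disagrees with where strptime() placed it.

    Hypothetical helper; does not handle Feb 29 in non-leap targets.
    """
    if prefer == "past" and parsed > now:
        return parsed.replace(year=parsed.year - 100)
    if prefer == "future" and parsed < now:
        return parsed.replace(year=parsed.year + 100)
    return parsed


now = datetime(2026, 1, 1)  # pinned reference point for the example
in_future = datetime.strptime("15/06/68", "%d/%m/%y")  # 2068-06-15
in_past = datetime.strptime("15/06/99", "%d/%m/%y")    # 1999-06-15

print(shift_century(in_future, now, "past").year)    # 1968
print(shift_century(in_past, now, "future").year)    # 2099
print(shift_century(in_future, now, "future").year)  # 2068 (unchanged)
```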
From Opus:
…closes #445)

- Add _apply_century_preference() helper to date.py, mirroring the _get_correct_leap_year pattern from parser.py for consistency.
- Helper shifts 2-digit year dates ±100 years based on PREFER_DATES_FROM; on ValueError (Feb 29 → non-leap year) it finds the nearest valid leap year in the preferred direction using get_next/previous_leap_year.
- parse_with_formats now uses RELATIVE_BASE (falling back to UTC now) for both the missing-year branch and the 2-digit-year branch.
- Fix tz-aware RELATIVE_BASE crash: now is normalised to naive before comparison, preventing TypeError between offset-naive and offset-aware datetimes.
- Add 8 regression tests pinned to RELATIVE_BASE=datetime(2026, 1, 1) to avoid time-bomb failures.
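Two of these points, the Feb 29 fallback and the naive/aware normalisation, can be illustrated with a small standard-library sketch. Both `shift_years_leap_safe` and `to_naive` are hypothetical names; the real helper is described as using get_next/previous_leap_year, whereas this sketch simply steps one year at a time with calendar.isleap(). Dropping tzinfo is assumed here as one simple way to normalise and may differ from what the PR does.

```python
import calendar
from datetime import datetime, timezone


def shift_years_leap_safe(dt: datetime, delta: int) -> datetime:
    """Shift dt by `delta` years. If the target year cannot hold the
    date (Feb 29 in a non-leap year), keep moving in the same direction
    until a leap year is reached. Hypothetical helper."""
    target = dt.year + delta
    try:
        return dt.replace(year=target)
    except ValueError:
        step = 1 if delta > 0 else -1
        while not calendar.isleap(target):
            target += step
        return dt.replace(year=target)


def to_naive(dt: datetime) -> datetime:
    """Drop tzinfo so the value can be compared with naive datetimes."""
    return dt.replace(tzinfo=None)


leap_day = datetime(2000, 2, 29)
print(shift_years_leap_safe(leap_day, 100))   # 2104-02-29 00:00:00
print(shift_years_leap_safe(leap_day, -100))  # 1896-02-29 00:00:00

aware_base = datetime(2026, 1, 1, tzinfo=timezone.utc)
naive_parsed = datetime(2068, 6, 15)
try:
    aware_base < naive_parsed  # mixing aware and naive raises TypeError
    comparison_failed = False
except TypeError:
    comparison_failed = True

print(comparison_failed)                      # True
print(to_naive(aware_base) < naive_parsed)    # True
```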
Description
Fixes #445 - The settings.PREFER_DATES_FROM setting now correctly applies when date_formats are explicitly specified with 2-digit year formats (%y).

Problem
Previously, when the date_formats parameter was provided to dateparser.parse(), the PREFER_DATES_FROM setting had no effect on 2-digit year parsing.

Root Cause
When date_formats were provided, the parser would directly use Python's datetime.strptime() without applying the PREFER_DATES_FROM logic. This caused ambiguous 2-digit years to be interpreted in the wrong century.

Solution
Enhanced the parse_with_formats() function in dateparser/date.py to:
- Detect when a 2-digit year format (%y) is used in date_formats
- Apply a year adjustment based on the PREFER_DATES_FROM setting:
  - If PREFER_DATES_FROM='past', subtract 100 years
  - If PREFER_DATES_FROM='future', add 100 years
- Leave formats with a 4-digit year (%Y) unchanged

Changes
- parse_with_formats() function with PREFER_DATES_FROM logic

Testing
✅ All 24,056 existing tests pass (8 skipped, 1 xfailed)
✅ 5 new test cases specifically for this fix
Test Coverage
Before/After
Code Quality
dateparser/parser.py