Date parsing issue "Month out of range" #26

Closed
kescobo opened this issue Aug 2, 2018 · 3 comments

Comments

@kescobo

kescobo commented Aug 2, 2018

I've been running into a weird date parsing issue. I can't work out what the pattern is, though I've managed to nail down an MWE.

The linked csv has 4 rows of dates.

julia> load("parse_test.csv") |> DataFrame
ERROR: ArgumentError: Month: 27 out of range (1:12)
Stacktrace:
 [1] Date(::Int64, ::Int64, ::Int64) at ./dates/types.jl:204
 [2] tryparsenext(::TextParse.DateTimeToken{Date,DateFormat{Symbol("yyyy/mm/dd"),Tuple{Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'m'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'d'}}}}, ::String, ::Int64, ::Int64, ::TextParse.LocalOpts) at /Users/kev/.julia/v0.6/TextParse/src/field.jl:431
 [3] macro expansion at /Users/kev/.julia/v0.6/TextParse/src/util.jl:23 [inlined]
 [4] tryparsenext(::TextParse.Field{Date,TextParse.DateTimeToken{Date,DateFormat{Symbol("yyyy/mm/dd"),Tuple{Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'m'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'d'}}}}}, ::String, ::Int64, ::Int64, ::TextParse.LocalOpts) at /Users/kev/.julia/v0.6/TextParse/src/field.jl:569
#...

(the stack trace is super long, let me know if it would be useful to post the whole thing)
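For reference, frame [1] of the trace is the plain `Date(::Int64, ::Int64, ::Int64)` constructor, so the message comes from its range check: something upstream handed it a 27 in the month position. The snippet below is only a minimal sketch that reproduces the same message outside of any CSV parsing; the literal values are made up for illustration, and it uses `using Dates` as on Julia 0.7+ (on 0.6, as in this issue, the same names live in `Base.Dates`).

```julia
using Dates  # Julia 0.7+; on Julia 0.6 use Base.Dates instead

# Hypothetical values: a 27 (plausibly a day) passed where the month belongs.
msg = try
    Date(2016, 27, 5)
    "no error"
catch err
    # The constructor rejects out-of-range months with an ArgumentError.
    err isa ArgumentError ? err.msg : rethrow()
end

println(msg)
```

This does not identify which field got misplaced in the real file; it only shows that the error is raised by the constructor's validation rather than by the text scanning itself.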

There are 3 27s, two in the second row, and one in the last row. If I remove just the last row, it works.

julia> load("parse_test.csv") |> DataFrame
3×6 DataFrames.DataFrame
│ Row │ c1         │ c2         │ c3         │ c4         │ c5         │ c6         │
├─────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┤
│ 1   │ 0016-08-08 │ 0011-11-10 │ 0016-08-08 │ 0010-01-15 │ 0010-01-15 │ 0016-08-08 │
│ 2   │ 0016-05-27 │ 0010-12-13 │ 0016-05-27 │ 0012-01-15 │ 0012-01-15 │ 0016-01-01 │
│ 3   │ 0016-06-15 │ 0011-08-04 │ 0009-12-21 │ 0011-01-09 │ 0011-01-09 │ 0009-11-24 │

julia>

But if I leave the 4th row in and just change the 27 in the last row to a 2, I get the same ERROR: ArgumentError: Month: 27 out of range (1:12).

If I change all the 27s to 2s, I now get ERROR: ArgumentError: Month: 21 out of range (1:12), and again this error goes away if I delete the last row, even though there are no 21s in the last row.

The problem isn't specific to that row, though: this is part of a much larger csv file, and removing only row 4 from that file does not stop the error.

@kescobo
Author

kescobo commented Aug 2, 2018

Just now looking at the output, I see it's taking the 2 digit years to be 0016 rather than 2016, which is a different problem I suppose...
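The two-digit-year behaviour can be seen with the standard library alone. This is a small sketch, assuming a value such as `16/08/08` (the actual file contents are not shown in this thread): the year digits are read literally, giving year 16, and one possible workaround is to shift the result by 2000 years after parsing. The `raw` string and the fixed 2000-year offset are illustrative assumptions, not something TextParse.jl or CSVFiles.jl does for you.

```julia
using Dates  # Julia 0.7+; on Julia 0.6 use Base.Dates instead

raw = "16/08/08"                        # hypothetical two-digit-year value

# The year field is taken at face value, so "16" becomes year 16, not 2016.
parsed = Date(raw, dateformat"y/m/d")

# Workaround sketch: assume every date belongs to the 2000s and shift it.
fixed = parsed + Year(2000)

println(parsed)
println(fixed)
```

A fixed offset only makes sense when all dates fall in the same century; mixed data would need an explicit cutoff rule.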

@davidanthoff
Member

Can you try to load the file with TextParse.jl directly? That is the parsing package I use under the hood, and it would be good to narrow down whether this is a problem in CSVFiles.jl or TextParse.jl.
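A rough sketch of what that check might look like, assuming TextParse.jl is installed and using its `csvread` function, which returns the parsed columns together with the column names. The helper name `load_with_textparse` is hypothetical, and the call is guarded so the snippet does nothing if the file is not in the working directory.

```julia
using TextParse  # third-party package; assumed to be installed

# Hypothetical helper: read a CSV with TextParse alone, bypassing CSVFiles.jl.
function load_with_textparse(path::AbstractString)
    cols, colnames = csvread(path)
    return cols, colnames
end

# Only attempt the read if the MWE file from this issue is present.
if isfile("parse_test.csv")
    load_with_textparse("parse_test.csv")
end
```

If this call throws the same `ArgumentError`, the problem is in TextParse.jl; if it succeeds, the CSVFiles.jl layer is the more likely culprit.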

@kescobo
Author

kescobo commented Aug 2, 2018

Good call - looks like it's coming from there. Sorry about that, new issue: queryverse/TextParse.jl#69

@kescobo kescobo closed this as completed Aug 2, 2018