REF: Make PeriodArray an ExtensionArray (pandas-dev#22862)
* WIP: PeriodArray

* WIP

* remove debug

* Just moves

* PeriodArray.shift definition

* _data type

* clean

* accessor wip

* some more wip

* tshift, shift

* Arithmetic

* repr changes

* wip

* freq setter

* Added disabled ops

* copy

* Support concat

* object ctor

* Updates

* lint

* lint

* wip

* more wip

* array-setitem

* wip

* wip

* Use ._tshift internally for datetimelike ops

In preparation for PeriodArray / DatetimeArray / TimedeltaArray.

Index.shift has a different meaning from ExtensionArray.shift.

- Index.shift pointwise shifts each element by some amount
- ExtensionArray.shift shifts the *position* of each value in the array
  padding the end with NA

This is going to get confusing. This PR tries to avoid some of that by
internally using a new `_tshift` method (time-shift) when we want to do pointwise
shifting of each value. Places that know they want that behavior (like in the
datetimelike ops) should use that.

* deep

* Squashed commit of the following:

commit 23e5cfc
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 13:10:41 2018 -0500

    Use ._tshift internally for datetimelike ops

    In preparation for PeriodArray / DatetimeArray / TimedeltaArray.

    Index.shift has a different meaning from ExtensionArray.shift.

    - Index.shift pointwise shifts each element by some amount
    - ExtensionArray.shift shifts the *position* of each value in the array
      padding the end with NA

    This is going to get confusing. This PR tries to avoid some of that by
    internally using a new `_tshift` method (time-shift) when we want to do pointwise
    shifting of each value. Places that know they want that behavior (like in the
    datetimelike ops) should use that.

commit 1d9f76c
Author: Joris Van den Bossche <[email protected]>
Date:   Tue Oct 2 17:11:11 2018 +0200

    CLN: remove Index._to_embed (pandas-dev#22879)

    * CLN: remove Index._to_embed

    * pep8

commit 6247da0
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 08:50:41 2018 -0500

    Provide default implementation for `data_repated` (pandas-dev#22935)

commit 5ce06b5
Author: Matthew Roeschke <[email protected]>
Date:   Mon Oct 1 14:22:20 2018 -0700

     BUG: to_datetime preserves name of Index argument in the result (pandas-dev#22918)

    * BUG: to_datetime preserves name of Index argument in the result

    * correct test

* Squashed commit of the following:

commit bccfc3f
Merge: d65980e 9caf048
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 13:47:48 2018 -0500

    Merge remote-tracking branch 'upstream/master' into period-dtype-type

commit 9caf048
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 13:25:22 2018 -0500

    CI: change windows vm image (pandas-dev#22948)

commit d65980e
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:46:38 2018 -0500

    typo

commit e5c61fc
Merge: d7a8e1b 1d9f76c
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 10:57:59 2018 -0500

    Merge remote-tracking branch 'upstream/master' into period-dtype-type

commit d7a8e1b
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 10:57:56 2018 -0500

    Fixed

commit 598cc62
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 10:32:22 2018 -0500

    doc note

commit 83db05c
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 10:28:52 2018 -0500

    updates

commit 1d9f76c
Author: Joris Van den Bossche <[email protected]>
Date:   Tue Oct 2 17:11:11 2018 +0200

    CLN: remove Index._to_embed (pandas-dev#22879)

    * CLN: remove Index._to_embed

    * pep8

commit 6247da0
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 08:50:41 2018 -0500

    Provide default implementation for `data_repated` (pandas-dev#22935)

commit f07ab80
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 06:22:27 2018 -0500

    str, bytes

commit 8a8bdb0
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 21:40:59 2018 -0500

    import at top

commit 99bafdd
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 21:38:12 2018 -0500

    Update type for PeriodDtype

    Removed unused IntervalDtypeType

commit 5ce06b5
Author: Matthew Roeschke <[email protected]>
Date:   Mon Oct 1 14:22:20 2018 -0700

     BUG: to_datetime preserves name of Index argument in the result (pandas-dev#22918)

    * BUG: to_datetime preserves name of Index argument in the result

    * correct test

* fixup

* The rest of the EA tests

* docs

* rename to time_shift

* Squashed commit of the following:

commit 11a0d93
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 14:26:34 2018 -0500

    typerror

commit a0cd5e7
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 14:25:38 2018 -0500

    TypeError for Series

commit 2247461
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:29:29 2018 -0500

    Test op(Series[EA], EA])

commit c9fe5d3
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:21:33 2018 -0500

    make strict

commit 7ef697c
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:14:52 2018 -0500

    Use super

commit 35d4213
Merge: 0671e7d ee80803
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:11:05 2018 -0500

    Merge remote-tracking branch 'upstream/master' into ea-divmod

commit ee80803
Author: Matthew Roeschke <[email protected]>
Date:   Wed Oct 3 08:25:44 2018 -0700

     BUG: Correctly weekly resample over DST (pandas-dev#22941)

    * test resample fix

    * move the localization until needed

    * BUG: Correctly weekly resample over DST

    * Move whatsnew to new section

commit fea27f0
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 08:49:44 2018 -0500

    CI: pin moto to 1.3.4 (pandas-dev#22959)

commit 15d32bb
Author: jbrockmendel <[email protected]>
Date:   Wed Oct 3 04:32:35 2018 -0700

    [CLN] Dispatch (some) Frame ops to Series, avoiding _data.eval (pandas-dev#22019)

    * avoid casting to object dtype in mixed-type frames

    * Dispatch to Series ops in _combine_match_columns

    * comment

    * docstring

    * flake8 fixup

    * dont bother with try_cast_result

    * revert non-central change

    * simplify

    * revert try_cast_results

    * revert non-central changes

    * Fixup typo syntaxerror

    * simplify assertion

    * use dispatch_to_series in combine_match_columns

    * Pass unwrapped op where appropriate

    * catch correct error

    * whatsnew note

    * comment

    * whatsnew section

    * remove unnecessary tester

    * doc fixup

commit 3e3256b
Author: alimcmaster1 <[email protected]>
Date:   Wed Oct 3 12:23:22 2018 +0100

    Allow passing a mask to NanOps (pandas-dev#22865)

commit e756e99
Author: jbrockmendel <[email protected]>
Date:   Wed Oct 3 02:19:27 2018 -0700

    CLN: Use is_period_dtype instead of ABCPeriodIndex checks (pandas-dev#22958)

commit 03181f0
Author: Wenhuan <[email protected]>
Date:   Wed Oct 3 15:28:07 2018 +0800

    BUG: fix Series(extension array) + extension array values addition (pandas-dev#22479)

commit 04ea51d
Author: Joris Van den Bossche <[email protected]>
Date:   Wed Oct 3 09:24:36 2018 +0200

    CLN: small clean-up of IntervalIndex (pandas-dev#22956)

commit b0f9a10
Author: Tony Tao <[email protected]>
Date:   Tue Oct 2 19:01:08 2018 -0500

    DOC GH22893 Fix docstring of groupby in pandas/core/generic.py (pandas-dev#22920)

commit 08ecba8
Author: jbrockmendel <[email protected]>
Date:   Tue Oct 2 14:22:53 2018 -0700

    BUG: fix DataFrame+DataFrame op with timedelta64 dtype (pandas-dev#22696)

commit c44bad2
Author: Pamela Wu <[email protected]>
Date:   Tue Oct 2 17:16:25 2018 -0400

    CLN GH22873 Replace base excepts in pandas/core (pandas-dev#22901)

commit 8e749a3
Author: Pamela Wu <[email protected]>
Date:   Tue Oct 2 17:14:48 2018 -0400

    CLN GH22874 replace bare excepts in pandas/io/pytables.py (pandas-dev#22919)

commit 1102a33
Author: Joris Van den Bossche <[email protected]>
Date:   Tue Oct 2 22:31:36 2018 +0200

    DOC/CLN: clean-up shared_docs in generic.py (pandas-dev#20074)

commit 9caf048
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 13:25:22 2018 -0500

    CI: change windows vm image (pandas-dev#22948)

commit 0671e7d
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:10:42 2018 -0500

    Fixup

commit 1b4261f
Merge: c92a4a8 1d9f76c
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 10:58:43 2018 -0500

    Merge remote-tracking branch 'upstream/master' into ea-divmod

commit 1d9f76c
Author: Joris Van den Bossche <[email protected]>
Date:   Tue Oct 2 17:11:11 2018 +0200

    CLN: remove Index._to_embed (pandas-dev#22879)

    * CLN: remove Index._to_embed

    * pep8

commit 6247da0
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 08:50:41 2018 -0500

    Provide default implementation for `data_repated` (pandas-dev#22935)

commit c92a4a8
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 16:56:15 2018 -0500

    Update old test

commit 52538fa
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 16:51:48 2018 -0500

    BUG: divmod return type

commit 5ce06b5
Author: Matthew Roeschke <[email protected]>
Date:   Mon Oct 1 14:22:20 2018 -0700

     BUG: to_datetime preserves name of Index argument in the result (pandas-dev#22918)

    * BUG: to_datetime preserves name of Index argument in the result

    * correct test

* Squashed commit of the following:

commit 7714e79
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 10:13:06 2018 -0500

    Always return ndarray

commit 1921c6f
Merge: 01f7366 fea27f0
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 09:50:30 2018 -0500

    Merge remote-tracking branch 'upstream/master' into combine-exception

commit fea27f0
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 08:49:44 2018 -0500

    CI: pin moto to 1.3.4 (pandas-dev#22959)

commit 15d32bb
Author: jbrockmendel <[email protected]>
Date:   Wed Oct 3 04:32:35 2018 -0700

    [CLN] Dispatch (some) Frame ops to Series, avoiding _data.eval (pandas-dev#22019)

    * avoid casting to object dtype in mixed-type frames

    * Dispatch to Series ops in _combine_match_columns

    * comment

    * docstring

    * flake8 fixup

    * dont bother with try_cast_result

    * revert non-central change

    * simplify

    * revert try_cast_results

    * revert non-central changes

    * Fixup typo syntaxerror

    * simplify assertion

    * use dispatch_to_series in combine_match_columns

    * Pass unwrapped op where appropriate

    * catch correct error

    * whatsnew note

    * comment

    * whatsnew section

    * remove unnecessary tester

    * doc fixup

commit 3e3256b
Author: alimcmaster1 <[email protected]>
Date:   Wed Oct 3 12:23:22 2018 +0100

    Allow passing a mask to NanOps (pandas-dev#22865)

commit e756e99
Author: jbrockmendel <[email protected]>
Date:   Wed Oct 3 02:19:27 2018 -0700

    CLN: Use is_period_dtype instead of ABCPeriodIndex checks (pandas-dev#22958)

commit 03181f0
Author: Wenhuan <[email protected]>
Date:   Wed Oct 3 15:28:07 2018 +0800

    BUG: fix Series(extension array) + extension array values addition (pandas-dev#22479)

commit 04ea51d
Author: Joris Van den Bossche <[email protected]>
Date:   Wed Oct 3 09:24:36 2018 +0200

    CLN: small clean-up of IntervalIndex (pandas-dev#22956)

commit b0f9a10
Author: Tony Tao <[email protected]>
Date:   Tue Oct 2 19:01:08 2018 -0500

    DOC GH22893 Fix docstring of groupby in pandas/core/generic.py (pandas-dev#22920)

commit 08ecba8
Author: jbrockmendel <[email protected]>
Date:   Tue Oct 2 14:22:53 2018 -0700

    BUG: fix DataFrame+DataFrame op with timedelta64 dtype (pandas-dev#22696)

commit c44bad2
Author: Pamela Wu <[email protected]>
Date:   Tue Oct 2 17:16:25 2018 -0400

    CLN GH22873 Replace base excepts in pandas/core (pandas-dev#22901)

commit 8e749a3
Author: Pamela Wu <[email protected]>
Date:   Tue Oct 2 17:14:48 2018 -0400

    CLN GH22874 replace bare excepts in pandas/io/pytables.py (pandas-dev#22919)

commit 1102a33
Author: Joris Van den Bossche <[email protected]>
Date:   Tue Oct 2 22:31:36 2018 +0200

    DOC/CLN: clean-up shared_docs in generic.py (pandas-dev#20074)

commit 01f7366
Merge: 5372134 9caf048
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 13:50:28 2018 -0500

    Merge remote-tracking branch 'upstream/master' into combine-exception

commit 9caf048
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 13:25:22 2018 -0500

    CI: change windows vm image (pandas-dev#22948)

commit 5372134
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:35:07 2018 -0500

    fixed move

commit ce1a3c6
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:32:11 2018 -0500

    fixed move

commit b9c7e4b
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:28:57 2018 -0500

    remove old note

commit a4a2933
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:24:48 2018 -0500

    handle test

commit be63feb
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:19:17 2018 -0500

    move test

commit 0eef0cf
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:18:18 2018 -0500

    move back

commit 2183f7b
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:17:28 2018 -0500

    api

commit 85fc5d8
Merge: 9059c0d 1d9f76c
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:15:52 2018 -0500

    Merge remote-tracking branch 'upstream/master' into combine-exception

commit 1d9f76c
Author: Joris Van den Bossche <[email protected]>
Date:   Tue Oct 2 17:11:11 2018 +0200

    CLN: remove Index._to_embed (pandas-dev#22879)

    * CLN: remove Index._to_embed

    * pep8

commit 6247da0
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 08:50:41 2018 -0500

    Provide default implementation for `data_repated` (pandas-dev#22935)

commit 9059c0d
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 06:33:15 2018 -0500

    Note

commit 0c53f08
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 06:30:54 2018 -0500

    Imports

commit ce94bf9
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 06:28:16 2018 -0500

    Moves

commit fdd43c4
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 21:26:09 2018 -0500

    Closes pandas-dev#22850

commit 5ce06b5
Author: Matthew Roeschke <[email protected]>
Date:   Mon Oct 1 14:22:20 2018 -0700

     BUG: to_datetime preserves name of Index argument in the result (pandas-dev#22918)

    * BUG: to_datetime preserves name of Index argument in the result

    * correct test

* Squashed commit of the following:

commit 11a0d93
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 14:26:34 2018 -0500

    typerror

commit a0cd5e7
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 14:25:38 2018 -0500

    TypeError for Series

commit 2247461
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:29:29 2018 -0500

    Test op(Series[EA], EA])

commit c9fe5d3
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:21:33 2018 -0500

    make strict

commit 7ef697c
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:14:52 2018 -0500

    Use super

commit 35d4213
Merge: 0671e7d ee80803
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:11:05 2018 -0500

    Merge remote-tracking branch 'upstream/master' into ea-divmod

commit 0671e7d
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:10:42 2018 -0500

    Fixup

commit 1b4261f
Merge: c92a4a8 1d9f76c
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 10:58:43 2018 -0500

    Merge remote-tracking branch 'upstream/master' into ea-divmod

commit c92a4a8
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 16:56:15 2018 -0500

    Update old test

commit 52538fa
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 16:51:48 2018 -0500

    BUG: divmod return type

* fixed merge conflict

* Handle divmod test

* extension tests passing

* Squashed commit of the following:

commit c9d6e89
Author: Tom Augspurger <[email protected]>
Date:   Thu Oct 4 08:34:22 2018 -0500

    xpass -> skip

commit 95d5cbf
Author: Tom Augspurger <[email protected]>
Date:   Thu Oct 4 08:22:17 2018 -0500

    typo, import

commit 4e9b7f0
Author: Tom Augspurger <[email protected]>
Date:   Thu Oct 4 08:18:40 2018 -0500

    doc update

commit cc2bfc8
Merge: 11a0d93 fe67b94
Author: Tom Augspurger <[email protected]>
Date:   Thu Oct 4 08:15:46 2018 -0500

    Merge remote-tracking branch 'upstream/master' into ea-divmod

commit fe67b94
Author: Tom Augspurger <[email protected]>
Date:   Thu Oct 4 06:55:09 2018 -0500

    Update type for PeriodDtype / DatetimeTZDtype / IntervalDtype (pandas-dev#22938)

commit b12e5ba
Author: Tom Augspurger <[email protected]>
Date:   Thu Oct 4 06:30:29 2018 -0500

    Safer is dtype (pandas-dev#22975)

commit c19c805
Author: Tom Augspurger <[email protected]>
Date:   Thu Oct 4 06:27:54 2018 -0500

    Catch Exception in combine (pandas-dev#22936)

commit d553ab3
Author: Anjali2019 <[email protected]>
Date:   Thu Oct 4 13:24:06 2018 +0200

    TST: Fixturize series/test_combine_concat.py (pandas-dev#22964)

commit 4c78b97
Author: Anjali2019 <[email protected]>
Date:   Thu Oct 4 13:23:39 2018 +0200

    TST: Fixturize series/test_constructors.py (pandas-dev#22965)

commit 45d3bb7
Author: Anjali2019 <[email protected]>
Date:   Thu Oct 4 13:23:20 2018 +0200

    TST: Fixturize series/test_datetime_values.py (pandas-dev#22966)

commit f1a22ff
Author: Anjali2019 <[email protected]>
Date:   Thu Oct 4 13:22:21 2018 +0200

    TST: Fixturize series/test_dtypes.py (pandas-dev#22967)

commit abf68fd
Author: Anjali2019 <[email protected]>
Date:   Thu Oct 4 13:21:45 2018 +0200

    TST: Fixturize series/test_io.py (pandas-dev#22972)

commit e6b0c29
Author: Anjali2019 <[email protected]>
Date:   Thu Oct 4 13:20:46 2018 +0200

    TST: Fixturize series/test_missing.py (pandas-dev#22973)

commit 9b405b8
Author: Joris Van den Bossche <[email protected]>
Date:   Thu Oct 4 13:16:28 2018 +0200

    CLN: values is required argument in _shallow_copy_with_infer (pandas-dev#22983)

commit c282e31
Author: h-vetinari <[email protected]>
Date:   Thu Oct 4 03:34:35 2018 +0200

    Fix ASV import error (pandas-dev#22978)

commit 11a0d93
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 14:26:34 2018 -0500

    typerror

commit a0cd5e7
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 14:25:38 2018 -0500

    TypeError for Series

commit 2247461
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:29:29 2018 -0500

    Test op(Series[EA], EA])

commit c9fe5d3
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:21:33 2018 -0500

    make strict

commit 7ef697c
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:14:52 2018 -0500

    Use super

commit 35d4213
Merge: 0671e7d ee80803
Author: Tom Augspurger <[email protected]>
Date:   Wed Oct 3 13:11:05 2018 -0500

    Merge remote-tracking branch 'upstream/master' into ea-divmod

commit 0671e7d
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 11:10:42 2018 -0500

    Fixup

commit 1b4261f
Merge: c92a4a8 1d9f76c
Author: Tom Augspurger <[email protected]>
Date:   Tue Oct 2 10:58:43 2018 -0500

    Merge remote-tracking branch 'upstream/master' into ea-divmod

commit c92a4a8
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 16:56:15 2018 -0500

    Update old test

commit 52538fa
Author: Tom Augspurger <[email protected]>
Date:   Mon Oct 1 16:51:48 2018 -0500

    BUG: divmod return type

* merge conflict

* wip

* indexes passing

* op names

* extension, arrays passing

* fixup

* lint

* Fixed to_timestamp

* Same error message for index, series

* Fix freq handling in to_timestamp

* dtype update

* accept kwargs

* fixups

* updates

* explicit

* add to assert

* wip period_array

* wip period_array

* order

* sort order

* test for hashing

* update

* lint

* boxing

* fix fixtures

* infer

* Remove seemingly unreachable code

* lint

* wip

* Updates for master

* simplify

* wip

* remove view

* simplify

* lint

* Removed add_comparison_methods

* xfail op

* remove some

* constructors

* Constructor cleanup

* misc fixups

* more xfails

* typo

* Added asi8

* Allow setting nan

* revert breaking docs

* Override _add_sub_int_array

* lint

* Update PeriodIndex._simple_new

* Clean up uses of .values, ._values, ._ndarray_values, ._data

* one more values

* remove xfails

* Fixed freq handling in _shallow_copy with a freq

* test updates

* API: Keep PeriodIndex.values an ndarray

* BUG: Raise for non-equal freq in take

* Punt on DataFrame.replace specializing

* lint

* fixed xfail message

* TST: _from_datetime64

* Fixups

- Perf in period_array
- pyarrow error
- py2 compat

* escape

* dtype

* revert and unxfail values

* error catching

* isort

* Avoid PeriodArray.values

* clarify _box_func usage

* TST: unxfail ops tests

* Avoid use of .values

* __setitem__ type

* Misc cleanups

* docstring on PeriodArray
* examples for period_array
* remove _box_values_as_index
* names
* object_dtype
* use __sub__

* lint

* API: remove ordinal from period_array

* catch exception

* misc cleanup

* Handle astype integer size

* Bump test coverage

* remove partial test

* close bracket

* change the test

* isort

* consistent _data

* lint

* ndarray_values -> asi8

* colocate ops

* refactor PeriodIndex.item

remove unused method

* return NotImplemented for Series / Index

* remove xpass

* release note

* types, use data

* remove ufunc xpass
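
The two meanings of `shift` that the commit message distinguishes can be sketched in plain Python. This is an illustration only, not pandas code: `time_shift` and `positional_shift` are hypothetical helper names, periods are modeled as integer ordinals, and `None` stands in for NA.

```python
# Illustrative sketch only; these helpers are hypothetical, not pandas APIs.
# Periods are modeled as integer ordinals and NA as None.

def time_shift(ordinals, n):
    """Pointwise shift (Index.shift semantics): move every value by n steps."""
    return [None if value is None else value + n for value in ordinals]


def positional_shift(values, periods=1, fill_value=None):
    """Positional shift (ExtensionArray.shift semantics): move values to new
    positions, keep the length, and pad the vacated slots with fill_value."""
    size = len(values)
    if periods == 0 or size == 0:
        return list(values)
    pad = [fill_value] * min(abs(periods), size)
    if periods > 0:
        # Values move toward the end; the vacated slots are at the start.
        return pad + list(values[:size - len(pad)])
    # Values move toward the start; the vacated slots are at the end.
    return list(values[len(pad):]) + pad


months = [360, 361, 362]  # three consecutive ordinals
print(time_shift(months, 1))        # [361, 362, 363]
print(positional_shift(months, 1))  # [None, 360, 361]
```

The pointwise version changes every value and never introduces NA, while the positional version keeps the values and changes where they sit. That difference is presumably why the commit uses a separately named internal method for the pointwise case.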
TomAugspurger authored and Pingviinituutti committed Feb 28, 2019
1 parent 9f56e15 commit 9f791ad
Showing 51 changed files with 1,779 additions and 577 deletions.
29 changes: 19 additions & 10 deletions doc/source/whatsnew/v0.24.0.txt
@@ -145,33 +145,41 @@ Current Behavior:

 .. _whatsnew_0240.enhancements.interval:

-Storing Interval Data in Series and DataFrame
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Storing Interval and Period Data in Series and DataFrame
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Interval data may now be stored in a ``Series`` or ``DataFrame``, in addition to an
-:class:`IntervalIndex` like previously (:issue:`19453`).
+Interval and Period data may now be stored in a ``Series`` or ``DataFrame``, in addition to an
+:class:`IntervalIndex` and :class:`PeriodIndex` like previously (:issue:`19453`, :issue:`22862`).

 .. ipython:: python

    ser = pd.Series(pd.interval_range(0, 5))
    ser
    ser.dtype

-Previously, these would be cast to a NumPy array of ``Interval`` objects. In general,
-this should result in better performance when storing an array of intervals in
-a :class:`Series`.
+And for periods:
+
+.. ipython:: python
+
+   pser = pd.Series(pd.period_range("2000", freq="D", periods=5))
+   pser
+   pser.dtype
+
+Previously, these would be cast to a NumPy array with object dtype. In general,
+this should result in better performance when storing an array of intervals or periods
+in a :class:`Series` or column of a :class:`DataFrame`.

-Note that the ``.values`` of a ``Series`` containing intervals is no longer a NumPy
+Note that the ``.values`` of a ``Series`` containing one of these types is no longer a NumPy
 array, but rather an ``ExtensionArray``:

 .. ipython:: python

    ser.values
+   pser.values

 This is the same behavior as ``Series.values`` for categorical data. See
 :ref:`whatsnew_0240.api_breaking.interval_values` for more.

-
 .. _whatsnew_0240.enhancements.other:

 Other Enhancements
@@ -360,7 +368,7 @@ New Behavior:
 This mirrors ``CategoricalIndex.values``, which returns a ``Categorical``.

 For situations where you need an ``ndarray`` of ``Interval`` objects, use
-:meth:`numpy.asarray` or ``idx.astype(object)``.
+:meth:`numpy.asarray`.

 .. ipython:: python

@@ -810,6 +818,7 @@ update the ``ExtensionDtype._metadata`` tuple to match the signature of your
 - Updated the ``.type`` attribute for ``PeriodDtype``, ``DatetimeTZDtype``, and ``IntervalDtype`` to be instances of the dtype (``Period``, ``Timestamp``, and ``Interval`` respectively) (:issue:`22938`)
 - :func:`ExtensionArray.isna` is allowed to return an ``ExtensionArray`` (:issue:`22325`).
 - Support for reduction operations such as ``sum``, ``mean`` via opt-in base class method override (:issue:`22762`)
+- :meth:`Series.unstack` no longer converts extension arrays to object-dtype ndarrays. The output ``DataFrame`` will now have the same dtype as the input. This changes behavior for Categorical and Sparse data (:issue:`23077`).

 .. _whatsnew_0240.api.incompatibilities:

2 changes: 1 addition & 1 deletion pandas/core/arrays/__init__.py
@@ -4,7 +4,7 @@
 from .categorical import Categorical  # noqa
 from .datetimes import DatetimeArrayMixin  # noqa
 from .interval import IntervalArray  # noqa
-from .period import PeriodArrayMixin  # noqa
+from .period import PeriodArray, period_array  # noqa
 from .timedeltas import TimedeltaArrayMixin  # noqa
 from .integer import (  # noqa
     IntegerArray, integer_array)
23 changes: 19 additions & 4 deletions pandas/core/arrays/categorical.py
@@ -29,6 +29,7 @@
     is_categorical_dtype,
     is_float_dtype,
     is_integer_dtype,
+    is_object_dtype,
     is_list_like, is_sequence,
     is_scalar, is_iterator,
     is_dict_like)
@@ -342,7 +343,6 @@ def __init__(self, values, categories=None, ordered=None, dtype=None,
         # a.) use categories, ordered
         # b.) use values.dtype
         # c.) infer from values
-
         if dtype is not None:
             # The dtype argument takes precedence over values.dtype (if any)
             if isinstance(dtype, compat.string_types):
@@ -2478,11 +2478,26 @@ def _get_codes_for_values(values, categories):
     utility routine to turn values into codes given the specified categories
     """
     from pandas.core.algorithms import _get_data_algo, _hashtables
-    if is_dtype_equal(values.dtype, categories.dtype):
+    dtype_equal = is_dtype_equal(values.dtype, categories.dtype)
+
+    if dtype_equal:
         # To prevent erroneous dtype coercion in _get_data_algo, retrieve
         # the underlying numpy array. gh-22702
-        values = getattr(values, 'values', values)
-        categories = getattr(categories, 'values', categories)
+        values = getattr(values, '_ndarray_values', values)
+        categories = getattr(categories, '_ndarray_values', categories)
+    elif (is_extension_array_dtype(categories.dtype) and
+          is_object_dtype(values)):
+        # Support inferring the correct extension dtype from an array of
+        # scalar objects. e.g.
+        # Categorical(array[Period, Period], categories=PeriodIndex(...))
+        try:
+            values = (
+                categories.dtype.construct_array_type()._from_sequence(values)
+            )
+        except Exception:
+            # but that may fail for any reason, so fall back to object
+            values = ensure_object(values)
+            categories = ensure_object(categories)
     else:
         values = ensure_object(values)
         categories = ensure_object(categories)
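
The new `elif` branch in `_get_codes_for_values` follows a try-then-fall-back pattern: attempt to build typed values from a sequence of scalars, and compare as plain objects if that fails. Below is a minimal sketch of that pattern in plain Python, under stated assumptions. `codes_for_values`, `parse_dates` and the `from_sequence` argument are hypothetical stand-ins rather than pandas functions. `-1` marks a value that is not among the categories, as in `Categorical.codes`.

```python
# Hypothetical sketch of the try/fall-back pattern; not the pandas implementation.
from datetime import date


def codes_for_values(values, categories, from_sequence):
    """Return the position of each value in categories, or -1 if absent."""
    try:
        # Try to convert the raw scalars to the categories' own type.
        typed = from_sequence(values)
    except Exception:
        # Conversion may fail for any reason; fall back to the raw objects.
        typed = list(values)
    lookup = {category: code for code, category in enumerate(categories)}
    return [lookup.get(value, -1) for value in typed]


def parse_dates(values):
    """Convert ISO strings to date objects; raises ValueError on bad input."""
    return [date.fromisoformat(value) for value in values]


categories = [date(2000, 1, 1), date(2000, 1, 2)]
print(codes_for_values(["2000-01-02", "2000-01-01"], categories, parse_dates))  # [1, 0]
print(codes_for_values(["2000-01-02", "not a date"], categories, parse_dates))  # [-1, -1]
```

In the second call a single bad element makes the whole conversion fail, so every value is compared in its raw form and none matches. The sketch simplifies the fallback: the diff also converts the categories to object dtype, which has no counterpart here.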
18 changes: 4 additions & 14 deletions pandas/core/arrays/datetimelike.py
@@ -474,17 +474,8 @@ def _addsub_int_array(self, other, op):
         result : same class as self
         """
         assert op in [operator.add, operator.sub]
-        if is_period_dtype(self):
-            # easy case for PeriodIndex
-            if op is operator.sub:
-                other = -other
-            res_values = checked_add_with_arr(self.asi8, other,
-                                              arr_mask=self._isnan)
-            res_values = res_values.view('i8')
-            res_values[self._isnan] = iNaT
-            return self._from_ordinals(res_values, freq=self.freq)
-
-        elif self.freq is None:
+
+        if self.freq is None:
             # GH#19123
             raise NullFrequencyError("Cannot shift with no freq")

@@ -524,10 +515,9 @@ def _addsub_offset_array(self, other, op):
             left = lib.values_from_object(self.astype('O'))

         res_values = op(left, np.array(other))
-        kwargs = {}
         if not is_period_dtype(self):
-            kwargs['freq'] = 'infer'
-        return type(self)(res_values, **kwargs)
+            return type(self)(res_values, freq='infer')
+        return self._from_sequence(res_values)

     @deprecate_kwarg(old_arg_name='n', new_arg_name='periods')
     def shift(self, periods, freq=None):
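
The period branch removed from `_addsub_int_array` added integers to period ordinals while keeping missing entries missing. A rough plain-Python sketch of that ordinal arithmetic follows. `add_int_to_ordinals` is a hypothetical name, lists stand in for NumPy arrays, and `INAT` mirrors the minimum-int64 sentinel that pandas uses for `iNaT`. Rejecting a result equal to `INAT` is a choice made for this sketch, not something the diff shows.

```python
# Hypothetical sketch of NaT-preserving ordinal arithmetic; not pandas code.
import operator

INT64_MIN = -2**63
INT64_MAX = 2**63 - 1
INAT = INT64_MIN  # sentinel ordinal for a missing period


def add_int_to_ordinals(ordinals, other, op=operator.add):
    """Add or subtract integers elementwise, leaving INAT entries untouched."""
    if op not in (operator.add, operator.sub):
        raise ValueError("op must be operator.add or operator.sub")
    if len(ordinals) != len(other):
        raise ValueError("operands must have the same length")
    result = []
    for ordinal, step in zip(ordinals, other):
        if ordinal == INAT:
            # Missing stays missing, whatever is added to it.
            result.append(INAT)
            continue
        value = op(ordinal, step)
        # Reject int64 overflow, and also a result that would collide
        # with the INAT sentinel (a choice made for this sketch).
        if not INT64_MIN < value <= INT64_MAX:
            raise OverflowError("result does not fit in int64")
        result.append(value)
    return result


print(add_int_to_ordinals([10, INAT, 12], [1, 1, 1]))  # [11, -9223372036854775808, 13]
```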
4 changes: 2 additions & 2 deletions pandas/core/arrays/datetimes.py
@@ -832,7 +832,7 @@ def to_period(self, freq=None):
         pandas.PeriodIndex: Immutable ndarray holding ordinal values
         pandas.DatetimeIndex.to_pydatetime: Return DatetimeIndex as object
         """
-        from pandas.core.arrays import PeriodArrayMixin
+        from pandas.core.arrays import PeriodArray

         if self.tz is not None:
             warnings.warn("Converting to PeriodArray/Index representation "
@@ -847,7 +847,7 @@ def to_period(self, freq=None):

         freq = get_period_alias(freq)

-        return PeriodArrayMixin(self.values, freq=freq)
+        return PeriodArray._from_datetime64(self.values, freq, tz=self.tz)

     def to_perioddelta(self, freq):
         """