From 12c275c866ac4d6e655b584680cd9b3bee257998 Mon Sep 17 00:00:00 2001
From: Matt McKay
Date: Fri, 19 Jun 2026 12:14:04 +1000
Subject: [PATCH 1/2] [pandas, pandas_panel] Update lectures for pandas 3.0 compatibility

anaconda 2026.06 (merged in #562) ships pandas 3.0.3, so the interim
pandas>=3 pip pin and anaconda=2025.12 are no longer needed. This branch
now carries only the lecture updates:

- pandas_panel: drop redundant future_stack=True from .stack() calls
  (the new stacking layout is the default in pandas 3.0)
- pandas_panel: text now says axis=1 in groupby is "removed" not
  "deprecated" (it is fully removed in 3.0)
- pandas: df.where(df.POP >= 20000, False) -> df.where(df.POP >= 20000)
  so the example fills non-matching rows with NaN instead of scattering
  False through the string columns

Co-Authored-By: Claude Opus 4.8
---
 lectures/pandas.md       |  4 ++--
 lectures/pandas_panel.md | 10 +++++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/lectures/pandas.md b/lectures/pandas.md
index cec984bf..f231c826 100644
--- a/lectures/pandas.md
+++ b/lectures/pandas.md
@@ -349,10 +349,10 @@ df.loc[complexCondition]
 
 The ability to make changes in dataframes is important to generate a clean dataset for future analysis.
 
-**1.** We can use `df.where()` conveniently to "keep" the rows we have selected and replace the rest rows with any other values
+**1.** We can use `df.where()` conveniently to "keep" the rows we have selected and replace the rest rows with `NaN`
 
 ```{code-cell} ipython3
-df.where(df.POP >= 20000, False)
+df.where(df.POP >= 20000)
 ```
 
 **2.** We can simply use `.loc[]` to specify the column that we want to modify, and assign values
diff --git a/lectures/pandas_panel.md b/lectures/pandas_panel.md
index 8bb6b29e..8cb3b762 100644
--- a/lectures/pandas_panel.md
+++ b/lectures/pandas_panel.md
@@ -150,14 +150,14 @@ the row index (`.unstack()` works in the opposite direction - try it
 out)
 
 ```{code-cell} ipython3
-realwage.stack(future_stack=True).head()
+realwage.stack().head()
 ```
 
 We can also pass in an argument to select the level we would like to
 stack
 
 ```{code-cell} ipython3
-realwage.stack(level='Country', future_stack=True).head() # future_stack=True is required until pandas>3.0
+realwage.stack(level='Country').head()
 ```
 
 Using a `DatetimeIndex` makes it easy to select a particular time
@@ -167,7 +167,7 @@ Selecting one year and stacking the two lower levels of the
 `MultiIndex` creates a cross-section of our panel data
 
 ```{code-cell} ipython3
-realwage.loc['2015'].stack(level=(1, 2), future_stack=True).transpose().head() # future_stack=True is required until pandas>3.0
+realwage.loc['2015'].stack(level=(1, 2)).transpose().head()
 ```
 
 For the rest of lecture, we will work with a dataframe of the hourly
@@ -401,7 +401,7 @@ plt.show()
 We can also specify a level of the `MultiIndex` (in the column axis)
 to aggregate over.
 
-In the case of `groupby` we need to use `.T` to transpose the columns into rows as `pandas` has deprecated the use of `axis=1` in the `groupby` method.
+In the case of `groupby` we need to use `.T` to transpose the columns into rows as `pandas` has removed support for `axis=1` in the `groupby` method.
 
 ```{code-cell} ipython3
 merged.T.groupby(level='Continent').mean().head()
@@ -432,7 +432,7 @@ plt.show()
 summary statistics
 
 ```{code-cell} ipython3
-merged.stack(future_stack=True).describe()
+merged.stack().describe()
 ```
 
 This is a simplified way to use `groupby`.

From 4f81ab6623d8416d038b6d8217e12656a86969ef Mon Sep 17 00:00:00 2001
From: Matt McKay
Date: Fri, 19 Jun 2026 12:41:14 +1000
Subject: [PATCH 2/2] Address Copilot grammar feedback

- pandas: "replace the rest rows" -> "replace the remaining rows"
- pandas_panel: add commas around the introductory phrase / before "as"
  to fix the run-on sentence about groupby axis=1 removal

Co-Authored-By: Claude Opus 4.8
---
 lectures/pandas.md       | 2 +-
 lectures/pandas_panel.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lectures/pandas.md b/lectures/pandas.md
index f231c826..8218f44d 100644
--- a/lectures/pandas.md
+++ b/lectures/pandas.md
@@ -349,7 +349,7 @@ df.loc[complexCondition]
 
 The ability to make changes in dataframes is important to generate a clean dataset for future analysis.
 
-**1.** We can use `df.where()` conveniently to "keep" the rows we have selected and replace the rest rows with `NaN`
+**1.** We can use `df.where()` conveniently to "keep" the rows we have selected and replace the remaining rows with `NaN`
 
 ```{code-cell} ipython3
 df.where(df.POP >= 20000)
diff --git a/lectures/pandas_panel.md b/lectures/pandas_panel.md
index 8cb3b762..fcb44255 100644
--- a/lectures/pandas_panel.md
+++ b/lectures/pandas_panel.md
@@ -401,7 +401,7 @@ plt.show()
 We can also specify a level of the `MultiIndex` (in the column axis)
 to aggregate over.
 
-In the case of `groupby` we need to use `.T` to transpose the columns into rows as `pandas` has removed support for `axis=1` in the `groupby` method.
+In the case of `groupby`, we need to use `.T` to transpose the columns into rows, as `pandas` has removed support for `axis=1` in the `groupby` method.
 
 ```{code-cell} ipython3
 merged.T.groupby(level='Continent').mean().head()