Fix dataframe warnings on ChainedAssignmentError #38428

Draft
shunping wants to merge 13 commits into apache:master from shunping:fix-df-warnings

Collaborator

@shunping shunping commented May 9, 2026

  • Fixed a flaky approximate quantile test.

    File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py310/build/srcs/sdks/python/apache_beam/testing/util.py", line 202, in _equal
      raise BeamAssertException(msg)
    apache_beam.testing.util.BeamAssertException: Failed assert: [[(99.9, 499), (72.5, 225), (50.0, 0)]] == [[(99.9, 499), (22.5, 275), (50.0, 0)]], unexpected elements [[(99.9, 499), (22.5, 275), (50.0, 0)]], missing elements [[(99.9, 499), (72.5, 225), (50.0, 0)]] [while running 'checkGloballyWithKeyAndReversed/Match']
    self = <apache_beam.transforms.stats_test.ApproximateQuantilesTest testMethod=test_batched_quantiles>
    
         def test_batched_quantiles(self):
     >     with TestPipeline() as p:
    
    apache_beam/transforms/stats_test.py:482: 
    
  • Fixed the pandas ChainedAssignmentError FutureWarnings raised in the dataframe code.

    /Users/runner/work/beam/beam/sdks/python/apache_beam/dataframe/schemas.py:100: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
    You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
    A typical example is when you are setting values in a column of a DataFrame, like:
    
    df["col"][row_indexer] = value
    
    Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.
    
    See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    
      proxy[name] = proxy[name].astype(dtype)
    
    /Users/runner/work/beam/beam/sdks/python/target/.tox/py314-macos/lib/python3.14/site-packages/apache_beam/typehints/pandas_type_compatibility.py:227: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
    You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
    A typical example is when you are setting values in a column of a DataFrame, like:
    
    df["col"][row_indexer] = value
    
    Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.
    
    See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    
      batch[column] = batch[column].astype(dtype_from_typehint(typehint))
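
For reference, here is a minimal sketch of two ways to write such updates without chained assignment. It is only an illustration, not necessarily the change made in this PR: `cast_columns` and `set_where` are hypothetical helper names, and the sample frame is made up. The first uses `DataFrame.astype` with a column-to-dtype mapping, which returns a new frame; the second uses the single-step `.loc` assignment that the warning text recommends.

```python
import pandas as pd


def cast_columns(df: pd.DataFrame, dtypes: dict) -> pd.DataFrame:
    """Hypothetical helper: cast the listed columns and return a new frame."""
    # astype accepts a {column: dtype} mapping and returns a new object,
    # so no item assignment on an existing frame is involved.
    return df.astype(dtypes)


def set_where(df: pd.DataFrame, mask: pd.Series, column: str, value) -> pd.DataFrame:
    """Hypothetical helper: set `value` in `column` for rows selected by `mask`."""
    # One indexing step on the frame itself, instead of df[column][mask] = value.
    df.loc[mask, column] = value
    return df


# Made-up sample data for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

casted = cast_columns(df, {"a": "float64"})
updated = set_where(df.copy(), df["a"] > 1, "b", "changed")
```

`cast_columns` leaves the input frame untouched, which keeps it safe under Copy-on-Write; `set_where` mutates the frame it is given, so the usage above passes an explicit copy.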
    

@github-actions github-actions Bot added the python label May 9, 2026