[AMORO-2810]: Support multiple nodes to access Amoro Rest service in a high availability environment #3567

Closed
czy006 wants to merge 8 commits into apache:master from czy006:zk-ha-rest

Conversation

@czy006

@czy006 czy006 commented May 16, 2025

Contributor

Support multiple nodes to access Amoro Rest service in a high availability environment

Why are the changes needed?

  • Support multiple nodes to access the Amoro Rest service in a high availability environment
  • Only the leader node runs OptimizingService

Close #2810.

Brief change log

  • Support multiple nodes to access Amoro Rest service in a high availability environment

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (not documented)

@github-actions github-actions Bot added the module:ams-server Ams server module label May 16, 2025
@czy006 czy006 force-pushed the zk-ha-rest branch 2 times, most recently from 934c69a to eba90fe Compare May 16, 2025 07:05
@czy006

czy006 commented May 22, 2025

Contributor Author

Tested with 3 Amoro nodes: the service switches over to the new leader normally.

image

@czy006 czy006 requested a review from baiyangtx May 22, 2025 07:58
@czy006 czy006 changed the title [Draft]Support multiple nodes to access Amoro Rest service in a high availability environment [improve]: Support multiple nodes to access Amoro Rest service in a high availability environment May 22, 2025
@codecov-commenter

codecov-commenter commented May 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 21.76%. Comparing base (e69710a) to head (55b63e4).
⚠️ Report is 247 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #3567   +/-   ##
=========================================
  Coverage     21.76%   21.76%           
  Complexity     2391     2391           
=========================================
  Files           436      436           
  Lines         40498    40498           
  Branches       5743     5743           
=========================================
  Hits           8816     8816           
  Misses        30935    30935           
  Partials        747      747           
Flag Coverage Δ
trino 21.76% <ø> (ø)

@zhoujinsong zhoujinsong left a comment

Contributor

Thanks a lot for the work! I left a minor comment, PTAL.

Comment thread amoro-ams/src/main/java/org/apache/amoro/server/AmoroServiceContainer.java Outdated
@czy006 czy006 requested review from xxubai and zhoujinsong May 26, 2025 06:56

@xxubai xxubai left a comment

Contributor

Looks good overall.

Comment thread amoro-ams/src/main/java/org/apache/amoro/server/AmoroServiceContainer.java Outdated
@czy006 czy006 requested a review from xxubai May 29, 2025 06:23

@xxubai xxubai left a comment

Contributor

LGTM. I'm not an expert in HA services, but take my +1.

service.startOptimizingService();
service.waitFollowerShip();
// become follower, dispose optimizingService stop
service.stopOptimizingService();

Contributor

Why didn't you put this in service.dispose? Or you could change service.dispose in the finally block to service.stopOptimizingService?

Contributor Author

dispose will close all services (including REST). If this node becomes a follower, we only stop the OptimizingService, and the node then goes back to waiting to become leader in service.waitLeaderShip().

Contributor

But the finally block will dispose all services, so shouldn't you use stopOptimizingService to replace dispose in the finally block?

Contributor

Yes, the dispose process will close the REST service. It should be fixed.

I opened a PR to fix it: czy006#2
@czy006 You can check it out.
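
To make the concern in this thread concrete, here is a minimal Java sketch of a leader loop in which losing leadership stops only the optimizing service and dispose runs only when the node itself shuts down. It is not the content of czy006#2 or of the actual AmoroServiceContainer: `FakeAmsService`, `HaLoopSketch`, `startRestService`, and the boolean return value of `waitLeaderShip` are assumptions made for illustration. Only the method names startOptimizingService, stopOptimizingService, waitLeaderShip, waitFollowerShip, and dispose come from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the AMS service container. Only the method names
// startOptimizingService, stopOptimizingService, waitLeaderShip,
// waitFollowerShip and dispose come from this thread; the bodies are fakes.
class FakeAmsService {
    final List<String> events = new ArrayList<>();
    private int remainingTerms; // leadership terms to simulate before shutdown

    FakeAmsService(int terms) { this.remainingTerms = terms; }

    // Assumed name: starts the REST endpoint on this node.
    void startRestService() { events.add("rest:start"); }

    // Assumption: returns false once the node is shutting down. The real
    // method would block on the ZooKeeper election instead.
    boolean waitLeaderShip() { return remainingTerms-- > 0; }

    void startOptimizingService() { events.add("optimizing:start"); }

    // The real method would block until leadership is lost; the fake returns at once.
    void waitFollowerShip() { }

    void stopOptimizingService() { events.add("optimizing:stop"); }

    // Closes everything, including the REST service.
    void dispose() { events.add("dispose"); }
}

class HaLoopSketch {
    static void run(FakeAmsService service) {
        service.startRestService(); // every node serves REST, leader or not
        try {
            while (service.waitLeaderShip()) {
                service.startOptimizingService();
                service.waitFollowerShip();
                // Became follower: stop only the optimizing service, keep REST up.
                service.stopOptimizingService();
            }
        } finally {
            service.dispose(); // reached only when the node itself shuts down
        }
    }

    public static void main(String[] args) {
        FakeAmsService service = new FakeAmsService(2);
        run(service);
        System.out.println(service.events);
    }
}
```

With this shape, the REST service is started once and survives every leader-to-follower transition. If dispose sat in a finally block inside the loop instead, each loss of leadership would also close REST, which is the problem raised above.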

Contributor Author

Thanks for the feedback. I am using the new logic to verify it, which will take some time. At the same time, we are designing a more general high-availability solution, which will support K8S in the future and will be launched later. cc @Aireed @zhoujinsong @klion26

Contributor

Sounds great! Looking forward to it!

@xxubai xxubai changed the title [improve]: Support multiple nodes to access Amoro Rest service in a high availability environment [AMORO-2810]: Support multiple nodes to access Amoro Rest service in a high availability environment Jun 5, 2025
@klion26

klion26 commented Jun 6, 2025

Member

Sorry for jumping in this late.

After this change, will two Amoro servers handle the same table update logic (PUT/POST and other REST requests)? If so, could this lead to incorrect results or state?

@czy006

czy006 commented Jun 9, 2025

Contributor Author

> Sorry for jumping in this late.
>
> After this change, will two Amoro servers handle the same table update logic (PUT/POST and other REST requests)? If so, could this lead to incorrect results or state?

The optimizing service will not run on more than one node at a time: we ensure that only one master node handles the Optimizer. The other nodes can handle REST requests, such as creating and deleting catalogs. Will this have an impact?

@klion26

klion26 commented Jun 14, 2025

Member

>> Sorry for jumping in this late.
>> After this change, will two Amoro servers handle the same table update logic (PUT/POST and other REST requests)? If so, could this lead to incorrect results or state?
>
> The optimizing service will not run on more than one node at a time: we ensure that only one master node handles the Optimizer. The other nodes can handle REST requests, such as creating and deleting catalogs. Will this have an impact?

Since more than one server will update the same table's info/state, we need to handle this carefully to avoid inconsistent state caused by concurrent modification. Maybe we need to add a doc to analyze this.
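
One common way to guard against this kind of concurrent modification is an optimistic version check: a write succeeds only if the table's version is still the one the writer read. The Java sketch below is illustrative only and is not something proposed in this PR. `TableStateStore` and its methods are hypothetical names, and the in-memory maps stand in for whatever shared store the servers actually use, where the same check would typically be a conditional update on a version column.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for shared table state. In a multi-node
// deployment the same check would have to live in the shared store itself.
class TableStateStore {
    private final Map<String, Long> versions = new HashMap<>();
    private final Map<String, String> states = new HashMap<>();

    // Applies the update only if nobody else changed the table since the
    // caller read expectedVersion. Returns false on a conflicting write.
    synchronized boolean update(String table, long expectedVersion, String newState) {
        long current = versions.getOrDefault(table, 0L);
        if (current != expectedVersion) {
            return false; // another server won the race; caller must re-read and retry
        }
        versions.put(table, current + 1);
        states.put(table, newState);
        return true;
    }

    synchronized long version(String table) {
        return versions.getOrDefault(table, 0L);
    }

    synchronized String state(String table) {
        return states.get(table);
    }
}
```

For example, if two servers both read version 0 of the same table and then both try to write, the first call to `update` succeeds and bumps the version to 1, while the second returns false instead of silently overwriting the first change.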

@zhoujinsong

Contributor

Replaced by #3737.

Labels

module:ams-server Ams server module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Support multi-instance AMS REST service

6 participants