[Feature] User can use different hadoop-user to submit application #3401

Merged
merged 7 commits into from
Dec 28, 2023

Conversation

lordk911
Contributor

What changes were proposed in this pull request

Issue Number: close #3222

Brief change log

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

Does this pull request potentially affect one of the following parts

  • Dependencies (does it add or upgrade a dependency): no

@wolfboys
Member

Overall, it looks good 👍 Keep up the good work! I'm really looking forward to seeing the next update. 😊

@github-actions github-actions bot added the BUILD label Dec 18, 2023
@lordk911
Contributor Author

@wolfboys I think the key changes are in the YarnApplicationClient class. Calling UserGroupInformation.setLoginUser directly in the setConfig method is not correct; a proxy user should be used instead. This requires either changing the doSubmit method declaration or moving the setConfig changes into the doSubmit method. In a Kerberos-enabled environment, the hadoop-user's keytab would also have to be configured on the job definition page.

@lordk911
Contributor Author


@wolfboys Do you have any suggestions for this?
1. If UserGroupInformation.setLoginUser is used, the current user of the other tenants is changed as well.
2. If a proxy user is used, the user configured as streampark.hadoop-user-name must be allowed to impersonate other users.
3. If Kerberos is enabled, each tenant needs to upload their own keytab, and the keytab needs to be protected.
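
The difference between the first two points can be illustrated with a small, self-contained model. This is only a sketch and does not use Hadoop's classes: `ImpersonationModel`, `runAs`, and the tenant names are hypothetical stand-ins. In Hadoop itself the scoped variant would roughly correspond to `UserGroupInformation.createProxyUser(...)` followed by `doAs(...)`, and the impersonating user would additionally need to be permitted through the cluster's `hadoop.proxyuser.*` settings. The model only shows why a process-wide login user leaks to every tenant while a scoped identity does not.

```java
import java.util.concurrent.Callable;

/** Simplified stand-in for login-user state; NOT Hadoop's UserGroupInformation. */
final class ImpersonationModel {
    // Process-wide identity, shared by every tenant's submission in this JVM.
    private static volatile String loginUser = "streampark";

    // Scoped identity, visible only inside runAs(...) on the calling thread.
    private static final ThreadLocal<String> proxyUser = new ThreadLocal<>();

    /** Models setLoginUser: overwrites the identity for the whole process. */
    static void setLoginUser(String user) {
        loginUser = user;
    }

    /** Returns the scoped identity if one is active, else the process-wide one. */
    static String currentUser() {
        String scoped = proxyUser.get();
        return scoped != null ? scoped : loginUser;
    }

    /** Models a proxy-user doAs: the identity applies only while the action runs. */
    static <T> T runAs(String user, Callable<T> action) throws Exception {
        String previous = proxyUser.get();
        proxyUser.set(user);
        try {
            return action.call();
        } finally {
            // Restore whatever was active before, so nothing leaks out of the call.
            if (previous == null) {
                proxyUser.remove();
            } else {
                proxyUser.set(previous);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Scoped: only the submission itself runs as tenant_a.
        String submittedAs = runAs("tenant_a", ImpersonationModel::currentUser);
        System.out.println(submittedAs);   // tenant_a
        System.out.println(currentUser()); // streampark

        // Process-wide: every later caller now sees tenant_b.
        setLoginUser("tenant_b");
        System.out.println(currentUser()); // tenant_b
    }
}
```

In this model the scoped call leaves the shared identity untouched, which is the property that makes the proxy-user approach safe for several tenants submitting from one service.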

@wolfboys
Member


If Kerberos is involved, things will become complicated. How to manage the Kerberos configurations for different users is a problem we have to face. I suggest not considering Kerberos for now.

@lordk911
Contributor Author

@wolfboys The code has been modified to use proxyuser

@wolfboys
Member

@wolfboys The code has been modified to use proxyuser

Thanks for your contribution, I will review it later

@lordk911 lordk911 marked this pull request as ready for review December 26, 2023 08:30
@caicancai
Member

Could you update the title, for example to "[Feature] User can use different hadoop-user to submit application"? The first letter should preferably be capitalized. Thank you.

@wolfboys
Member

The hadoopUser also needs to be set in the application copy method

@wolfboys
Member

One more question, have you tested it? Will the set hadoopUser work ok?

@lordk911 lordk911 changed the title user can use different hadoop-user to submit application [Feature] User can use different hadoop-user to submit application Dec 28, 2023
@lordk911
Contributor Author

One more question, have you tested it? Will the set hadoopUser work ok?

I've tested it against streampark 2.1.2

@lordk911
Contributor Author

lordk911 commented Dec 28, 2023

The hadoopUser also needs to be set in the application copy method

done

Member

@wolfboys wolfboys left a comment


LGTM, good job🤗

@wolfboys wolfboys merged commit 0533ec3 into apache:dev Dec 28, 2023
7 of 8 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature] when submit application to yarn , use the user logged in to streampark.
3 participants