-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MINOR] improvement(spark-client): put sparkConf as extra properties while client request accessCluster #2254
[MINOR] improvement(spark-client): put sparkConf as extra properties while client request accessCluster #2254
Conversation
e7e5f4f
to
827ee04
Compare
Test Results 2 943 files + 3 2 943 suites +3 6h 18m 14s ⏱️ + 7m 42s For more details on these failures, see this check. Results for commit 6c16c8c. ± Comparison against base commit 54611f3. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Although putting all confs into the general properites looks a little bit strange
@@ -123,6 +123,7 @@ private boolean tryAccessCluster() { | |||
Map<String, String> extraProperties = Maps.newHashMap(); | |||
extraProperties.put( | |||
ACCESS_INFO_REQUIRED_SHUFFLE_NODES_NUM, String.valueOf(assignmentShuffleNodesNum)); | |||
extraProperties.putAll(RssSparkConfig.sparkConfToMap(sparkConf)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't put Spark configurations to coordinator. Coordinator isn't related to compute framework. The spark configurations is coupled with Spark framework. You can have an extractor instead of passing them directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is related to #2255 , I want to determine whether to ban an app by a specific ban id
which extract from one of the spark client config, but I don't want client config it concretely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether it is good design to have a ban id list. It seems that it isn't common enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not common if bannedIdProvider contains spark.
@jerqi Thanks for your suggestion, removed the |
// Put all spark conf into extra properties, except which length is longer than 100 | ||
// to avoid extra properties too long. | ||
RssSparkConfig.toRssConf(sparkConf).getAll().stream() | ||
.filter(entry -> StringUtils.length((String) entry.getValue()) < 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too tricky....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Em... Thanks for your review! I'm sorry for this.
I would greatly appreciate it if you could give me some advice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can implement a white list or configuration filter. The white list can be configured in the configuration. The configuration filter could have length configuration filter. The length could be configured in the configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think use black list instead of white list can be possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok for me, too. black list
would better to be replaced by excludeList
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@jerqi Exclude config added, PTAL |
|
||
RssConf rssConf = RssSparkConfig.toRssConf(sparkConf); | ||
List<String> excludeProperties = | ||
rssConf.get(RssClientConf.RSS_CLIENT_REPORT_EXCLUDE_PROPERTIES); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default is null? Will it cause an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your tip, I set a default value.
746adcf
to
9b01609
Compare
@@ -307,6 +307,6 @@ public class RssClientConf { | |||
ConfigOptions.key("rss.client.reportExcludeProperties") | |||
.stringType() | |||
.asList() | |||
.noDefaultValue() | |||
.defaultValues("hdfs.inputPaths") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Em... this is a weird value.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2254 +/- ##
============================================
- Coverage 52.65% 52.36% -0.29%
+ Complexity 3376 2969 -407
============================================
Files 514 466 -48
Lines 27565 22326 -5239
Branches 2582 2044 -538
============================================
- Hits 14513 11691 -2822
+ Misses 12092 9890 -2202
+ Partials 960 745 -215 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
// Put all spark conf into extra properties, except which length is longer than 100 | ||
// to avoid extra properties too long. | ||
rssConf.getAll().stream() | ||
.filter(entry -> StringUtils.length((String) entry.getValue()) < 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you remove <100
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
RssConf rssConf = RssSparkConfig.toRssConf(sparkConf); | ||
List<String> excludeProperties = | ||
rssConf.get(RssClientConf.RSS_CLIENT_REPORT_EXCLUDE_PROPERTIES); | ||
// Put all spark conf into extra properties, except which length is longer than 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you remove the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -303,4 +303,10 @@ public class RssClientConf { | |||
.withDescription( | |||
"The block id manager class of server for this application, " | |||
+ "the implementation of this interface to manage the shuffle block ids"); | |||
public static final ConfigOption<List<String>> RSS_CLIENT_REPORT_EXCLUDE_PROPERTIES = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@jerqi Thanks for your review! merge it |
We forgot documents. |
@jerqi sorry for that, will add it this week. |
…while client request accessCluster (#2254) ### What changes were proposed in this pull request? put sparkConf as extra properties while client request accessCluster ### Why are the changes needed? Coordinator can let the spark application access rss cluster or not by some customized access checker leverage some of the spark configs. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? No need
…port extraProperties (#2265) ### What changes were proposed in this pull request? - Add include key filter before report extraProperties - Related docs ### Why are the changes needed? - Add include key filter - Add documents. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested on our test cluster. - Client conf ``` --conf spark.shuffle.manager=org.apache.spark.shuffle.DelegationRssShuffleManager --conf spark.rss.access.id="access.id" --conf spark.rss.client.reportIncludeProperties="spark.test.key,test2.key,app.id,yarn.queue" --conf spark.rss.client.reportExcludeProperties="app.id" ``` - Coordinator_rpc_audit.log ``` [2024-11-26 21:14:45.632] cmd=accessCluster statusCode=ACCESS_DENIED from=/XXX:34733 executionTimeUs=276(<1s) appId=N/A args{accessInfo=AccessInfo{accessId='rss-test-mbl-003-bannedid', user= tdwadmin, tags=[ss_v5], extraProperties={access_info_required_shuffle_nodes_num=-1, yarn.queue=g_teg_tdw_operlog-offline, spark.test.key=123_456}}} ```
What changes were proposed in this pull request?
put sparkConf as extra properties while client request accessCluster
Why are the changes needed?
Coordinator can let the spark application access rss cluster or not by some customized access checker leverage some of the spark configs.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
No need