update builder (Docker and package files for better java support)#113
update builder (Docker and package files for better java support)#113michaelmayer2 wants to merge 2 commits into
Conversation
| --enable-prebuilt-html" | ||
|
|
||
| # Make sure that patching the OS does not break Java | ||
| ENV JAVA_HOME=/usr/lib/jvm/jre-11-openjdk |
There was a problem hiding this comment.
While working on RHEL 9 support and trying to incorporate these improvements there, I learned that the default R from EPEL/Fedora uses a completely unversioned path that's managed by the alternatives system:
R CMD config JAVA_HOME
# /usr/lib/jvm/jre
# To change the major version of Java at /usr/lib/jvm/jre:
update-alternatives --config javaFrom quick testing, using JAVA_HOME=/usr/lib/jvm/jre seems to support even major Java version switches without having to run R CMD javareconf. Switching from Java 11 to 17 seemed to just work, although switching from Java 11/17 to 8 still required a R CMD javareconf since some paths changed from 8 to 11:
# Java 8
JAVA_LIBS = -L$(JAVA_HOME)/lib/amd64/server -ljvm
JAVA_LD_LIBRARY_PATH = $(JAVA_HOME)/lib/amd64/server
# Java 11/17
JAVA_LIBS = -L$(JAVA_HOME)/lib/server -ljvm
JAVA_LD_LIBRARY_PATH = $(JAVA_HOME)/lib/serverWhat do you think of using this completely unversioned alternatives path? It doesn't work for all major version switches, but would at least accommodate users going from 11 to 17. And would also match EPEL/Fedora R's behavior.
There was a problem hiding this comment.
I found similar behavior with the Java packages on openSUSE/SLES 15.3. The path they use there is /usr/lib64/jvm:
zypper -n install java-11-openjdk-devel java-1_8_0-openjdk-devel
ls /usr/lib64/jvm -la
# lrwxrwxrwx 1 root root 26 Aug 8 23:17 java -> /etc/alternatives/java_sdk
# lrwxrwxrwx 1 root root 29 Aug 8 23:17 java-11 -> /etc/alternatives/java_sdk_11
# lrwxrwxrwx 1 root root 18 Apr 21 08:49 java-11-openjdk -> java-11-openjdk-11
# drwxr-xr-x 7 root root 4096 Aug 8 23:17 java-11-openjdk-11
# lrwxrwxrwx 1 root root 34 Aug 8 23:17 java-openjdk -> /etc/alternatives/java_sdk_openjdk
# lrwxrwxrwx 1 root root 21 Aug 8 23:17 jre -> /etc/alternatives/jre
# lrwxrwxrwx 1 root root 24 Aug 8 23:17 jre-11 -> /etc/alternatives/jre_11
# lrwxrwxrwx 1 root root 18 Apr 21 08:49 jre-11-openjdk -> java-11-openjdk-11
# lrwxrwxrwx 1 root root 29 Aug 8 23:17 jre-openjdk -> /etc/alternatives/jre_openjdk
/usr/lib64/jvm/jre/bin/java -version
# openjdk version "11.0.15" 2022-04-19
update-alternatives --config java
/usr/lib64/jvm/jre/bin/java -version
# openjdk version "1.8.0_332"There was a problem hiding this comment.
I would suggest we keep the version number encoded in JAVA_HOME, e.g. /usr/lib/jvm/jre-11-openjdk. As you point out, moving between java 11 and 17 seem to work but not when moving to 8 (not without a javareconf).
The idea of this PR is to make Java integration a bit more seamless. I am not sure if we are creating more issues by moving to the extremely generic path compared to the issues that we are trying to solve.
| RUN gem install fpm:1.11.0 | ||
|
|
||
| # Make sure that patching the OS does not break Java | ||
| ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/ |
There was a problem hiding this comment.
On Ubuntu/Debian, the R builds here appear to already use this major-versioned location:
R CMD config JAVA_HOME
# /usr/lib/jvm/java-11-openjdk-amd64Checking the default Ubuntu/Debian R packages, they appear to all use /usr/lib/jvm/default-java from default-jdk or default-jre:
R CMD config JAVA_HOME
# /usr/lib/jvm/default-javaUnfortunately it doesn't look like you can control the JRE location using the alternatives system in Ubuntu/Debian. /usr/lib/jvm/default-java is put there by default-jdk/default-jre, and not present if you install Java 11 or Java 8 specifically.
ls /usr/lib/jvm -la
# lrwxrwxrwx 1 root root 25 Jul 17 2019 default-java -> java-1.11.0-openjdk-amd64
# lrwxrwxrwx 1 root root 21 Jul 20 22:37 java-1.11.0-openjdk-amd64 -> java-11-openjdk-amd64
# lrwxrwxrwx 1 root root 21 Jul 22 21:02 java-1.17.0-openjdk-amd64 -> java-17-openjdk-amd64
# drwxr-xr-x 9 root root 4096 Aug 8 23:34 java-11-openjdk-amd64
# drwxr-xr-x 9 root root 4096 Aug 8 23:34 java-17-openjdk-amd64
# drwxr-xr-x 2 root root 4096 Aug 8 23:34 openjdk-11
# drwxr-xr-x 2 root root 4096 Aug 8 23:34 openjdk-17But still, maybe it's better to use /usr/lib/jvm/default-java anyway to match the distro default R and because it's common for users to install default-jdk (not sure, might be wrong about this)? At least for RSPM, default-jdk is what we've been using for instructions on installing Java.
| -d xz-devel \ | ||
| -d zip \ | ||
| -d zlib-devel \ | ||
| -d java-11-openjdk-devel \ |
There was a problem hiding this comment.
Since Java is often an optional dependency and somewhat heavy, could we leave it out of the DEB/RPM package dependencies? That would better support users who want to switch between Java 8, 11, 17, etc. as well. I think the default R packages from the various distros also leave Java as an optional dependency.
There was a problem hiding this comment.
We can leave java out but why do we then spend effort on configuring Java in the first place ?
There was a problem hiding this comment.
I'm not sure if there was a specific reason for configuring Java in the first place, and would guess that it was just borrowed from how Debian and Fedora packaged R. It is nice to be able to just install Java and have it work out of the box without R CMD javareconf. But I still do think Java is too heavy to add as a DEB/RPM dependency since these are billed as "minimal" R builds and Java is an optional dependency. It'll add a few hundred MBs to many consumers who don't necessarily need Java, like the RStudio R Docker images, the RStudio Pro Product images, the GitHub Actions for R, and others. And those on Java 8 or 17 will have an extra unused installation of Java.
There was a problem hiding this comment.
Ok, I definitely agree with your sentiment about the size of the R build if we go with the Java integration. So far I only thought that this was an "opinionated R build" but not a "minimal build". Once probably could segregate the Java related things into a separate DEB/RPM but then the installation process becomes too complex again.
For now we can always refer to https://solutions.rstudio.com/r/using-rjava/ as a reference document for users wishing to use Java with R.
The idea of the PR was to make our R builds a bit more holistic and consistent but based on the arguments made above I agree it may be one or two steps too much in the wrong direction.
Let's make sure we can get OpenBLAS into Ubuntu 20.04 R builds and then we should close this PR.
There was a problem hiding this comment.
Yeah, it's kind of both opinionated and minimal. We do some opinionated things that aren't completely minimal, like including OpenBLAS and the -devel packages of a few libraries that are required to install packages that link against the R shared library (which would only be included in r-base-dev, not r-base for example). A separate DEB/RPM that includes Java would be nice, but agreed that it'd complicate installation. Maybe a middle ground would be to provide users with a preconfigured Java where possible, like in the various Docker images. That article also looks great, and I'll try to reference it when any Java configuration questions come up.
Do you still want to explicitly set JAVA_HOME to at least make a first time Java installation more seamless? I've done that with the RHEL 9 R builds: https://github.com/rstudio/r-builds/pull/133/files#diff-7c6c557e074afff5ac851dbf63d384a5deaea84626a0f6945f41767510c08c39R77-R83. I think it's still better than letting R default it to something like /usr/lib/jvm/java-11-openjdk-11.0.16.0.8-1.el9_0.x86_64.
There was a problem hiding this comment.
Yes, setting JAVA_HOME for the RHEL9 builds to the more generic folder I would strongly recommend. Also agree with the notion of extra DEB/RPM packages for Java where we even could be in the business of providing packages for any JDK offered, e.g. 1.8.0, 11, 17, ...
|
@michaelmayer2 I went ahead and resolved the merge conflicts, just clearing out some of the removed EOL distros. |
This pull request will add better java support to our opinionated R builds.
Main issue is that the build of R from sources will auto-detect the java version present on the system. This is great to start with. The problem however that it detects the JAVA_HOME and sets it to the physically existing path. Such a path not only contains the major version of Java but also any patch level or minor release versions as defined by the linux distro. In this pull request all the files in the directory builder are updated to set JAVA_HOME to a folder that is independent of any security patches or updates applied to the underlying linux distro.
Accepting this pull request will fix any issue that leads to the urban myth that using Java in R is difficult.
An article on solutions.rstudio.com is in preparation https://github.com/rstudio/solutions.rstudio.com/pull/210