Install latest Sparkling Water Version


I am following the installation guide for Sparkling Water, but it does not work at all. It consists of 8 steps, as you can see in: rsparkling

  • First problem: Step 2 installs an old version of sparklyr (not compatible with Spark 2.3.1). Solved using install.packages("https://github.com/rstudio/sparklyr/archive/v0.8.0.tar.gz", repos = NULL, type="source")
  • Step 3: version 2.3.1 of Spark is not available; the command sparklyr::spark_available_versions() lists nothing newer than 2.3.0. Solved by installing Spark directly from the Apache Spark page.
  • Step 6 does not work: it installs an unsupported version of rsparkling with h2o, packageVersion("h2o") #'3.21.0.4359'

I then tried the following: download the latest version of Sparkling Water, unzip the file, and run this code:

install.packages("C:/Users/USER/Downloads/sparkling-water-2.3.259_nightly/rsparkling.tar.gz", repos=NULL, type="source")
* installing *source* package 'rsparkling' ...
** package 'rsparkling' successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (rsparkling)
In R CMD INSTALL

Up to here everything seems fine.

options(rsparkling.sparklingwater.version = "2.3.259_nightly")
library(rsparkling)
# 7. Connect to Spark
sc <- sparklyr::spark_connect(master = "local")
Error: invalid version specification ‘2.3.259_nightly’

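The error text matches what base R's numeric_version() raises for a string that is not a dotted sequence of integers, so the `_nightly` suffix is the likely culprit. A minimal sketch in base R, assuming (this is not confirmed from the rsparkling source) that the option value is parsed this way; `is_valid_version` is a hypothetical helper used only for illustration:

```r
# Base R only; no Spark, sparklyr, or rsparkling is needed to run this.
nightly <- "2.3.259_nightly"
stable  <- "2.3.16"

# Hypothetical helper: with strict = FALSE, numeric_version() returns NA
# for an invalid string instead of raising an error.
is_valid_version <- function(x) {
  !is.na(numeric_version(x, strict = FALSE))
}

print(is_valid_version(nightly))
print(is_valid_version(stable))

# With the default strict = TRUE, the nightly string raises an error.
# Capture its message instead of stopping the script.
msg <- tryCatch(
  {
    numeric_version(nightly)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If that assumption holds, setting the option to a plain stable version number avoids the parse failure.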

Note: I downloaded the Sparkling Water Nightly Bleeding Edge version. The packages h2o, SparkR, and sparklyr and their connections work correctly on Windows 7 with R version 3.4.4; I only have problems with rsparkling.

system('spark-submit --version')
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.1
      /_/

Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_151
Branch 
Compiled by user vanzin on 2018-06-01T20:37:04Z

How can I solve this problem? I want to install the appropriate version of rsparkling, compatible with the latest version of h2o.

Edit: Thanks for the links, Lauren. I'm now working with the latest stable version of h2o (3.20.0.5) and Sparkling Water. However, I think the problem is not with the rsparkling package but with the sparklyr package: Apache Spark 2.3.1 was released on Jun 08 2018, while the latest update of sparklyr, 0.8.4, was released on May 25 2018, that is, before Spark 2.3.1 existed. As a result, Spark 2.3.1 is not listed:

spark_available_versions()
   spark
1  1.6.3
2  1.6.2
3  1.6.1
4  1.6.0
5  2.0.0
6  2.0.1
7  2.0.2
8  2.1.0
9  2.1.1
10 2.2.0
11 2.2.1
12 2.3.0

# Set spark connection
sc <- spark_connect(master = "local", version = "2.3.1") #It does not work
Error in spark_install_find(version, hadoop_version, latest = FALSE, hint = TRUE) : 
Spark version not installed. To install, use spark_install(version = "2.3.1")
spark_install(version = "2.3.1")
Error in spark_install_find(version, hadoop_version, installed_only = FALSE,  : 
Spark version not available. Find available versions, using spark_available_versions()
sc <- spark_connect(master = "local") #it works perfectly

I think the solution is to wait for sparklyr 0.9.0.
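The availability check that fails above can be reproduced without Spark. A minimal sketch in base R, using the list printed by spark_available_versions() above; `pick_spark_version` and `major_minor` are hypothetical helpers, not part of sparklyr, that fall back to the newest listed release in the same major.minor line:

```r
# Versions copied from the spark_available_versions() output above.
available <- c("1.6.3", "1.6.2", "1.6.1", "1.6.0", "2.0.0", "2.0.1",
               "2.0.2", "2.1.0", "2.1.1", "2.2.0", "2.2.1", "2.3.0")

# Hypothetical helper: reduce "2.3.1" to its major.minor prefix "2.3".
major_minor <- function(v) sub("^([0-9]+\\.[0-9]+).*$", "\\1", v)

# Hypothetical helper: return the requested version if it is listed,
# otherwise the newest listed version in the same major.minor line,
# or NA when that line is not listed at all.
pick_spark_version <- function(requested, available) {
  if (requested %in% available) {
    return(requested)
  }
  same_line <- available[major_minor(available) == major_minor(requested)]
  if (length(same_line) == 0) {
    return(NA_character_)
  }
  as.character(max(numeric_version(same_line)))
}

print(pick_spark_version("2.3.1", available))
print(pick_spark_version("2.2.1", available))
```

For a Spark build that was installed manually, spark_connect() also accepts a spark_home argument pointing at that installation, which may avoid the lookup against the available-versions list; this has not been verified for this setup.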

The nightly downloads page is meant to work for simple environments, and is not meant to capture all possible configurations.

However, since this question is specific to Windows, you can find documentation on how to Use Sparkling Water in Windows Environments here and how to Use Rsparkling in Windows Environments here (note these are for the latest stable, but the instructions should be similar for the nightly release).


First, install the latest version of sparklyr and connect to Spark:

library(sparklyr)
spark_install(version = "2.3.2")
sc <- spark_connect(master = "local", version = "2.3.2")

Install the matching version of H2O:

install.packages("h2o", type = "source", repos = "https://h2o-release.s3.amazonaws.com/h2o/rel-wright/10/R")
packageVersion("h2o")
[1] ‘3.20.0.10’

Verify the compatibility of Sparkling Water with H2O:

rsparkling::h2o_release_table()[1:5,]
   Spark_Version Sparkling_Water_Version H2O_Version H2O_Release_Name H2O_Release_Patch_Number
1            2.3                  2.3.16   3.20.0.10       rel-wright                       10
17           2.3                  2.3.15    3.20.0.9       rel-wright                        9
16           2.3                  2.3.14    3.20.0.8       rel-wright                        8
15           2.3                  2.3.13    3.20.0.7       rel-wright                        7
14           2.3                  2.3.12    3.20.0.6       rel-wright                        6
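The lookup in the table above can be automated. A minimal sketch in base R that hard-codes the five rows shown (in a real session the data frame would come from rsparkling::h2o_release_table()); `find_release` and `h2o_repo_url` are hypothetical helpers, and the repository URL pattern is inferred from the single install.packages() URL used in this answer:

```r
# The five rows printed above, rebuilt as a plain data frame.
release_table <- data.frame(
  Spark_Version = rep("2.3", 5),
  Sparkling_Water_Version = c("2.3.16", "2.3.15", "2.3.14", "2.3.13", "2.3.12"),
  H2O_Version = c("3.20.0.10", "3.20.0.9", "3.20.0.8", "3.20.0.7", "3.20.0.6"),
  H2O_Release_Name = rep("rel-wright", 5),
  H2O_Release_Patch_Number = c(10, 9, 8, 7, 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find the row matching a Spark line and an H2O version.
find_release <- function(tbl, spark_version, h2o_version) {
  hit <- tbl[tbl$Spark_Version == spark_version &
               tbl$H2O_Version == h2o_version, ]
  if (nrow(hit) == 0) {
    stop("No Sparkling Water release matches this Spark/H2O pair")
  }
  hit[1, ]
}

# Hypothetical helper: build the H2O R repository URL (assumed pattern).
h2o_repo_url <- function(release_name, patch) {
  sprintf("https://h2o-release.s3.amazonaws.com/h2o/%s/%d/R",
          release_name, as.integer(patch))
}

rel <- find_release(release_table, "2.3", "3.20.0.10")
sw_version <- rel$Sparkling_Water_Version
repo <- h2o_repo_url(rel$H2O_Release_Name, rel$H2O_Release_Patch_Number)
print(sw_version)
print(repo)
```

An H2O version with no row in the table, such as the 3.21.0.4359 build from the question, makes find_release() stop with an error rather than silently choosing a mismatched release.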

Set the Sparkling Water version to be used with RSparkling:

options(rsparkling.sparklingwater.version = "2.3.16")
library(rsparkling)

Now, H2OContext is available and we can use any H2O features available in R.

h2o_context(sc)
org.apache.spark.h2o.H2OContext

Sparkling Water Context:
 * H2O name: sparkling-water-USER_local-1539839100465
 * cluster size: 1
 * list of used nodes:
  (executorId, host, port)
  ------------------------
  (driver,127.0.0.1,54321)
  ------------------------

  Open H2O Flow in browser: http://127.0.0.1:54321 (CMD + click in Mac OSX)

h2o_flow(sc)

Initialize Spark UI

Now the integration of Spark with H2O through Sparkling Water works perfectly.


Although your question was specific to Windows, you may want to try this solution that worked fine in Mac.
