Scala Spark

Currently, users can run dynamic decryption queries on Text or Sequence files in Scala Spark. This section details the steps necessary to set up decryption libraries for Scala Spark.

Configuration

spark-shell --driver-class-path /PathToJar/DgDecrypter.jar:/PathToJar/gson-2.2.2.jar:/PathToJar/bcpg-jdk15on-150.jar:/PathToJar/bcprov-ext-jdk15on-150.jar

To add the above JARs to the Spark environment, replace “/PathToJar/” with the directory that holds these JARs on your local system.
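When all four JARs sit in one directory, the colon-separated value for --driver-class-path can be assembled programmatically rather than typed by hand. The following is only a sketch: the object DriverClassPath and its build method are hypothetical helpers, not part of the DgDecrypter library, and the JAR file names are simply copied from the spark-shell command above.

```scala
// Hypothetical helper (not part of DgDecrypter): joins JAR names under one
// directory into a value suitable for spark-shell --driver-class-path.
object DriverClassPath {
  // JAR file names taken from the spark-shell command in this section.
  val jarNames: Seq[String] = Seq(
    "DgDecrypter.jar",
    "gson-2.2.2.jar",
    "bcpg-jdk15on-150.jar",
    "bcprov-ext-jdk15on-150.jar"
  )

  // Uses ":" as the separator, matching the Unix-style command shown above.
  def build(jarDir: String, names: Seq[String] = jarNames): String = {
    val dir = jarDir.stripSuffix("/") // tolerate a trailing slash
    names.map(name => dir + "/" + name).mkString(":")
  }
}

// Example: print the full command for JARs stored in /opt/jars (an assumed path).
val classPath = DriverClassPath.build("/opt/jars")
println("spark-shell --driver-class-path " + classPath)
```

Building the string in one place reduces the chance of a typo in one of the four paths when the JAR directory changes.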

The dgSecure.properties file accompanying the DgDecrypter library has the same content as in the Hive case above. Type the following commands in the Spark shell:

// Import decrypter methods from the JARs
import com.dataguise.decrypterlib.java.decrypter.Decrypter
import com.dataguise.decrypterlib.java.decrypter.GeneralPreHook
import com.dataguise.decrypterlib.java.decrypter.GenerateConfParams

There are two ways to generate parameters:

  1. Without using the Hadoop configuration (for standalone Spark with no Hadoop connected or configured):
    val params = GenerateConfParams.generateParams()

  2. Using the Hadoop configuration (call the pre-hook with sc.hadoopConfiguration, then read the "params" entry):
    GeneralPreHook.config(sc.hadoopConfiguration)
    val params = sc.hadoopConfiguration.get("params")

Usage Sample A (Single String)

Type the following commands in the Spark shell:

// Import the decrypter method from the JARs
import com.dataguise.decrypterlib.java.decrypter.Decrypter

// Import the pre-hook from the JARs
import com.dataguise.decrypterlib.java.decrypter.GeneralPreHook

// Call the pre-hook to generate parameters using the Hadoop configuration
GeneralPreHook.config(sc.hadoopConfiguration)

// Create the encrypted input string
val input = "!260dg_32&:16540104CQ==/cDtCnWgT53VfyLPqznbKRa0cM748i7ZMLWeOc614ZI=!"

// Decrypt the encrypted string
val output = Decrypter.decryptString(input, sc.hadoopConfiguration.get("params"))

// Print the result
println(output)

Usage Sample B (MapReduce)

Type the following commands in the Spark shell:

// Import the decrypter method from the JARs
import com.dataguise.decrypterlib.java.decrypter.Decrypter

// Import the configuration parameter generator
import com.dataguise.decrypterlib.java.decrypter.GenerateConfParams

// Add the required JAR files to the distributed cache so that worker nodes can use them
sc.addJar("/opt/test/DgDecrypter.jar")
sc.addJar("/opt/jars/bcpg-jdk15on-150.jar")
sc.addJar("/opt/jars/bcprov-ext-jdk15on-150.jar")

// Generate the decryption parameters (here without using the Hadoop configuration)
val params = GenerateConfParams.generateParams()

// Create an RDD from the input file (a file in HDFS)
val file = sc.textFile("/Fan/text/StructuredData-I.txt")

// Create a word-count MapReduce job that decrypts each word before counting
val counts = file.flatMap(line => line.split(" ")).map(word => (com.dataguise.decrypterlib.java.decrypter.Decrypter.decryptString(word, params), 1)).reduceByKey(_ + _)

// Alternatively, use only a mapper that decrypts each word of the input
val counts = file.flatMap(line => line.split(" ")).map(word => com.dataguise.decrypterlib.java.decrypter.Decrypter.decryptString(word, params))

// Save the results
counts.saveAsTextFile("/Fan/result/task1")

// Alternatively, display the results directly
counts.collect()
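
The shape of the word-count pipeline in Sample B (split each line, decrypt each word, then sum per decrypted word) can be checked locally on plain Scala collections before running it on a cluster. The sketch below assumes nothing about the Dataguise library: DecryptedWordCount is a hypothetical object, and fakeDecrypt is a stand-in that merely reverses a string so the example runs without the JARs. It is not the DgDecrypter algorithm.

```scala
// Hypothetical local stand-in for the Sample B pipeline; no Spark or
// DgDecrypter JARs are needed to run it.
object DecryptedWordCount {
  // Mirrors flatMap(split) -> map((decrypt(word), 1)) -> reduceByKey(_ + _)
  // on a local Seq. The decrypt function is passed in by the caller.
  def count(lines: Seq[String], decrypt: String => String): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))
      .filter(_.nonEmpty)                 // skip empty tokens from repeated spaces
      .map(word => (decrypt(word), 1))
      .groupBy(_._1)                      // local equivalent of grouping by key
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  // Placeholder "decryption" for illustration only: reverses the token.
  def fakeDecrypt(token: String): String = token.reverse
}

// Two lines of fake "encrypted" tokens; "cba" appears twice.
val sample = Seq("cba cba", "zyx")
println(DecryptedWordCount.count(sample, DecryptedWordCount.fakeDecrypt))
```

In the Spark shell, the same structure would take a function that calls Decrypter.decryptString(word, params) in place of fakeDecrypt, as Sample B does.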