
I tried getting Spark's RDD example (compute the count of each word in a text file) to work, to no avail.
I'm new to Scala and Spark, so I'm trying to run this sample program on an already-set-up machine that came with Spark.
I took the boilerplate Java-like class code from a working example (for a different task) provided by a professor, removed the program-specific parts, and inserted the RDD example code.

Source of failing code: https://spark.apache.org/examples.html ... alternative link in case the previous one is broken

// Program assumes that
// 1) NO folder named "output" exists in the same directory before execution, and
// 2) the file "some_words.txt" already exists in the same directory before execution

// Calculates the frequency of words that occur in a document, outputting pairs like "(Anthill, 3)"
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val fileToRead = "input.txt" 
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)

    text_file = spark.sparkContext.textFile("some_words.txt")
    counts = (
      text_file.flatMap(lambda line: line.split(" "))
      .map(lambda word: (word, 1))
      .reduceByKey(lambda a, b: a + b)
    )
    println(counts)
    
    // Folder to save to
    wordsWithCount.saveAsTextFile("output")
    sc.stop()
  }
}

Errors:

')' expected but '(' found.
[error]         fileData.flatMap(lambda line: line.split(" "))
                                                        ^

identifier expected but integer literal found.
[error]         .map(lambda word: (word,1))
                                        ^

')' expected but '}' found.
[error]   }

My Makefile/Shellscript contains

sbt package
spark-submit --master local[4] target/scala-2.11/simple-project_2.11-1.0.jar
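
For sbt package to produce target/scala-2.11/simple-project_2.11-1.0.jar, the project also needs a build.sbt along these lines (a minimal sketch; the Spark version below is an assumption and should be matched to the machine's Spark installation):

name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.12"

// Assumed Spark version; align this with the installed Spark
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.8"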

1 Answer

The problem is that the example on that page is written in a different language than the rest of your program.
You're also mixing two languages in the same file, as shown by some variables being declared with val and others having no declaration keyword at all (just the variable name). Since you're new to Scala, you may not have recognized the different declaration styles.
The website's example is written in PySpark (Spark's Python API), which the page doesn't tell you.

An easy way to tell which language you're looking at: the lambda expressions (i.e., anonymous functions) look different in each.
(Some of the following language examples are from RDD Operations.)

Python:

  • func(lambda word: (word, 1))
  • func(lambda a,b: a+b)

Java:

  • func(word -> new Tuple2<>(word, 1)) (in the Java API, the pair-returning step is typically mapToPair)
  • func((a,b) -> a+b)

Scala:

  • func(word => (word,1))
  • func((a,b) => a+b)

Here "func" is just some function supported by Spark, like map, and "word", "a", and "b" are the function's inputs (and part of the outputs too).
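
To see the Scala forms in isolation, the lines below can be pasted into the Scala REPL (a standalone sketch, no Spark required; the names toPair and add are illustrative):

val toPair = (word: String) => (word, 1)
val add = (a: Int, b: Int) => a + b
println(toPair("Anthill"))  // prints (Anthill,1)
println(add(1, 2))          // prints 3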

Working example when using Scala Spark:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val fileToRead = "input.txt" 
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)

    val fileData = sc.textFile(fileToRead)
    val wordsWithCount = (
      fileData.flatMap(line => line.split(" "))  // one record per word
      .map(word => (word,1))                     // pair each word with a count of 1
      .reduceByKey((a,b) => a+b)                 // sum the counts for each word
    )
    println(wordsWithCount)
    
    // Folder to save to
    wordsWithCount.saveAsTextFile("output")
    sc.stop()
  }
}
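
Note that println(wordsWithCount) only prints the RDD's string representation (something like ShuffledRDD[3] at reduceByKey), not the counts themselves; those end up in the output folder. To inspect the pairs directly on a small input, collect them to the driver first (a usage sketch; avoid collect on large datasets):

wordsWithCount.collect().foreach(println)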
