Channel: How do I calculate percentages over groups in spark? - Stack Overflow

Answer by Vihit Shah for How do I calculate percentages over groups in spark?


Yes, you are right that you need windowed analytical functions. Please find the solutions to your queries below.

Hope it helps!

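import org.apache.spark.sql.expressions.Window  // required for Window; in spark-shell, enter the chain with :paste

// Caution: the BROKER window ignores FUND, so a broker present in several funds is summed across all of them.
spark.read.option("header", "true").option("delimiter", "|").csv("****")
  .withColumn("fundTotal", sum("QTY").over(Window.partitionBy("FUND")))  // total QTY per fund
  .withColumn("QTY%", sum("QTY").over(Window.partitionBy("BROKER")))     // total QTY per broker
  .select('FUND, 'BROKER, (($"QTY%" * 100) / 'fundTotal).as("QTY%"))
  .distinct
  .show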

And the second:

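import org.apache.spark.sql.expressions.Window  // required for Window; in spark-shell, enter the chain with :paste

spark.read.option("header", "true").option("delimiter", "|").csv("/vihit/data.csv")
  .withColumn("QTY%", sum("QTY").over(Window.partitionBy("BROKER")))  // total QTY per broker
  .select('FUND, 'BROKER, (('QTY * 100) / $"QTY%").as("QTY%"))         // each row's share of that total
  .distinct
  .show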
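
To sanity-check the numbers these windows produce without starting Spark, the same arithmetic can be sketched with plain Scala collections. This is only an illustration and not Spark code: the `Trade` case class, the `GroupPercent` object, and the sample rows are hypothetical, and the grouping key here is the (FUND, BROKER) pair, which is an assumption about the intended per-fund share.

```scala
// Hypothetical row type mirroring the FUND|BROKER|QTY columns used above.
final case class Trade(fund: String, broker: String, qty: Long)

object GroupPercent {
  /** Each (fund, broker) pair's share of its fund's total quantity, in percent. */
  def brokerShareOfFund(trades: Seq[Trade]): Seq[(String, String, Double)] = {
    // Plays the role of sum("QTY").over(Window.partitionBy("FUND")).
    val fundTotal: Map[String, Long] =
      trades.groupBy(_.fund).map { case (fund, rows) => fund -> rows.map(_.qty).sum }

    trades
      .groupBy(t => (t.fund, t.broker)) // one group per (FUND, BROKER) pair
      .toSeq
      .map { case ((fund, broker), rows) =>
        // Multiply by 100.0 so the division happens in floating point.
        (fund, broker, rows.map(_.qty).sum * 100.0 / fundTotal(fund))
      }
      .sortBy { case (fund, broker, _) => (fund, broker) }
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data, not taken from the question.
    val sample = Seq(Trade("F1", "B1", 10), Trade("F1", "B2", 30), Trade("F2", "B1", 20))
    brokerShareOfFund(sample).foreach(println)
  }
}
```

For the sample rows, fund F1's two brokers come out at 25% and 75%, and the single broker in F2 at 100%. A fund whose quantities sum to zero would yield NaN or Infinity, so real data may need a guard for that case.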
