PySpark Test Example

Published: 2025-05-20

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, lit, concat

# Create the SparkSession

spark = SparkSession.builder.appName("SparkSQLExample").getOrCreate()

# Create a DataFrame (it could also be read from CSV, JSON, or other files)

data = [("Alice", 586240, 177)]  # note: the separators here must be ASCII commas
columns = ["name", "lac", "ci"]

df = spark.createDataFrame(data, columns)

# Create the CGI column

df = df.withColumn(
    "cgi",
    concat(
        lit("3-"),
        (col("lac").cast("integer") * 256 + col("ci").cast("integer")).cast("string")
    )
)

# Show the result

df.show()

# Same computation again; "int" is an alias for "integer"
df = df.withColumn(
    "cgi",
    concat(
        lit("3-"),
        (col("lac").cast("int") * 256 + col("ci").cast("int")).cast("string")
    )
)

# Show the result

df.show()

# Stop the SparkSession

spark.stop()
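
As a quick sanity check on the arithmetic, the same CGI value can be computed in plain Python without starting Spark. This is only a sketch: the helper name `cgi_from_lac_ci` is hypothetical and not part of the original sample, and it simply mirrors the expression `lac * 256 + ci` with the `"3-"` prefix.

```python
def cgi_from_lac_ci(lac: int, ci: int, prefix: str = "3-") -> str:
    """Mirror the Spark expression: prefix + str(lac * 256 + ci).

    Hypothetical helper for illustration; not part of the original sample.
    """
    return f"{prefix}{lac * 256 + ci}"


# The single row used in the sample: lac=586240, ci=177.
expected = cgi_from_lac_ci(586240, 177)
print(expected)  # 3-150077617
```

The `cgi` column shown by `df.show()` for the `Alice` row should match this string.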

Example 2:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, lit, concat


spark = SparkSession.builder.appName("SparkSQLExample").getOrCreate()


data = [("Alice", 586240, 177)]
columns = ["name", "lac", "ci"]
df = spark.createDataFrame(data, columns)
df = df.withColumn(
    "cgi",
    concat(
        lit("3-"),
        (col("lac").cast("integer") * 256 + col("ci").cast("integer")).cast("string")
    )
)
df.show()

# Same computation again; "int" is an alias for "integer"
df = df.withColumn(
    "cgi",
    concat(
        lit("3-"),
        (col("lac").cast("int") * 256 + col("ci").cast("int")).cast("string")
    )
)

# Show the result
df.show()

# Stop the SparkSession
spark.stop()
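
One thing worth keeping in mind with the `cast("integer")` version: Spark's integer type is a 32-bit signed value, so `lac * 256 + ci` only fits while the result stays below about 2.1 billion. The sample row is well within range, but larger `lac` values would not be. The plain-Python sketch below illustrates the limit; the helper names and the value `9_000_000` are made up for illustration, and the wraparound shown assumes Spark's non-ANSI behaviour (with ANSI mode enabled, Spark is expected to raise an overflow error instead).

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1  # bounds of a 32-bit signed integer


def fits_int32(lac: int, ci: int) -> bool:
    """True if lac * 256 + ci stays within the 32-bit signed range."""
    return INT32_MIN <= lac * 256 + ci <= INT32_MAX


def wrap_int32(value: int) -> int:
    """Simulate two's-complement wraparound of a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31


# The row from the sample is safe.
print(fits_int32(586240, 177))  # True

# A hypothetical larger lac overflows the 32-bit range.
big_lac = 9_000_000
print(fits_int32(big_lac, 177))  # False
print(wrap_int32(big_lac * 256 + 177))  # -1990967119
```

If inputs of that size are possible, casting to `"long"` instead of `"integer"` in the Spark expression, for example `col("lac").cast("long") * 256 + col("ci").cast("long")`, is one way to gain headroom.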

