Writes the data from a data source to a set of tables in the Spark catalog.

ds_write_tables(ds, schema = NULL, import_mode = ImportMode$OVERWRITE)

Arguments

ds

The DataSource object.

schema

The name of the schema to write the tables to.

import_mode

The import mode to use when writing the data: ImportMode$OVERWRITE ("overwrite") replaces any existing data in the target tables, while ImportMode$MERGE ("merge") merges the new data with the existing data based on resource ID.
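
As an illustration (reusing the data_source object and the 'default' schema from the example below), the two modes are selected like this:

# Replace any existing tables in the schema with the contents of the data source.
data_source %>% ds_write_tables("default", import_mode = ImportMode$OVERWRITE)

# Merge the data source into the existing tables, matching rows on resource ID.
data_source %>% ds_write_tables("default", import_mode = ImportMode$MERGE)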

Value

No return value, called for side effects only.

Examples

# Create a temporary warehouse location, which will be used when we call ds_write_tables().
temp_dir_path <- tempfile()
dir.create(temp_dir_path)
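
# Connect to Spark with the Delta Lake extensions enabled: Pathling writes its
# catalog tables in Delta format, which the merge import mode relies on.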
sc <- sparklyr::spark_connect(master = "local[*]", config = list(
  "sparklyr.shell.conf" = c(
    paste0("spark.sql.warehouse.dir=", temp_dir_path),
    "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension",
    "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
  )
), version = pathling_spark_info()$spark_version)

pc <- pathling_connect(sc)
data_source <- pc %>% pathling_read_ndjson(pathling_examples('ndjson'))

# Write the data to a set of Spark tables in the 'default' schema.
data_source %>% ds_write_tables("default", import_mode = ImportMode$MERGE)
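
# (Illustrative addition, assuming the example NDJSON includes Patient resources.)
# ds_write_tables() should have created one table per resource type, so the
# Patient data can be queried back through the Spark catalog:
patient_count <- sparklyr::sdf_sql(sc, "SELECT COUNT(*) AS n FROM default.patient")
sparklyr::sdf_collect(patient_count)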

pathling_disconnect(pc)
unlink(temp_dir_path, recursive = TRUE)