process from multiple source
Less than 1 minute
process from multiple source
prepare
- prepare MergeTree table named
yellow_trip
- prepare s3 table named
fhvhv_trip
- prepare postgres table named
taxi_zone_lookup
interact with clickhouse
- union
yellow_trip
andfhvhv_trip
, and then join withtaxi_zone_lookup
podman run --rm \ --entrypoint clickhouse-client \ -it docker.io/clickhouse/clickhouse-server:23.11.5.29-alpine \ --host host.containers.internal \ --port 19000 \ --user ben \ --password 123456 \ --query "SELECT pickup_datetime, dropoff_datetime, PULocationID, borough, zone, service_zone FROM ( SELECT pickup_datetime, dropoff_datetime, PULocationID FROM fhvhv_trip UNION ALL ( SELECT tpep_pickup_datetime as pickup_datetime, tpep_dropoff_datetime as dropoff_datetime, PULocationID FROM yellow_trip ) ) join_table LEFT JOIN postgresql_tlc.taxi_zone_lookup AS zone_lookup ON join_table.PULocationID = zone_lookup.locationid LIMIT 10"