Skip to content

[TPC-H] query 8 produces different results after conversion to/from Substrait #954

@bestbeforetoday

Description

@bestbeforetoday

Using Isthmus to convert reference TPC-H SQL to Substrait and then back to SQL, the generated SQL produces different query results to the reference SQL (at least in PostgreSQL).

PostgreSqlIntegrationTest > testTpcH(int) > [8] 8 FAILED
    org.opentest4j.AssertionFailedError: Reference and generated SQL produce 2 different results.

    Reference SQL:
    select
      "O_YEAR",
      sum(case
        when "NATION" = 'EGYPT' then "VOLUME"
        else 0
      end) / sum("VOLUME") as "MKT_SHARE"
    from
      (
        select
          extract(year from "O"."O_ORDERDATE") as "O_YEAR",
          "L"."L_EXTENDEDPRICE" * (1 - "L"."L_DISCOUNT") as "VOLUME",
          "N2"."N_NAME" as "NATION"
        from
          "PART" "P",
          "SUPPLIER" "S",
          "LINEITEM" "L",
          "ORDERS" "O",
          "CUSTOMER" "C",
          "NATION" "N1",
          "NATION" "N2",
          "REGION" "R"
        where
          "P"."P_PARTKEY" = "L"."L_PARTKEY"
          and "S"."S_SUPPKEY" = "L"."L_SUPPKEY"
          and "L"."L_ORDERKEY" = "O"."O_ORDERKEY"
          and "O"."O_CUSTKEY" = "C"."C_CUSTKEY"
          and "C"."C_NATIONKEY" = "N1"."N_NATIONKEY"
          and "N1"."N_REGIONKEY" = "R"."R_REGIONKEY"
          and "R"."R_NAME" = 'MIDDLE EAST'
          and "S"."S_NATIONKEY" = "N2"."N_NATIONKEY"
          and "O"."O_ORDERDATE" between date '1995-01-01' and date '1996-12-31'
          and "P"."P_TYPE" = 'PROMO BRUSHED COPPER'
      ) as "ALL_NATIONS"
    group by
      "O_YEAR"
    order by
      "O_YEAR"

    Generated SQL:
    SELECT "t3"."$f600" AS "O_YEAR", "t3"."$f4" AS "MKT_SHARE"
    FROM (SELECT EXTRACT(YEAR FROM "ORDERS"."O_ORDERDATE") AS "$f600", SUM(CAST(CASE WHEN CAST("NATION0"."N_NAME" AS VARCHAR(25)) = 'EGYPT' THEN "LINEITEM"."L_EXTENDEDPRICE" * (1 - "LINEITEM"."L_DISCOUNT") ELSE 0 END AS DECIMAL(19, 0))) / SUM("LINEITEM"."L_EXTENDEDPRICE" * (1 - "LINEITEM"."L_DISCOUNT")) AS "$f4"
    FROM "PART",
    "SUPPLIER",
    "LINEITEM",
    "ORDERS",
    "CUSTOMER",
    "NATION",
    "NATION" AS "NATION0",
    "REGION"
    WHERE "PART"."P_PARTKEY" = "LINEITEM"."L_PARTKEY" AND "SUPPLIER"."S_SUPPKEY" = "LINEITEM"."L_SUPPKEY" AND ("LINEITEM"."L_ORDERKEY" = "ORDERS"."O_ORDERKEY" AND ("ORDERS"."O_CUSTKEY" = "CUSTOMER"."C_CUSTKEY" AND "CUSTOMER"."C_NATIONKEY" = "NATION"."N_NATIONKEY")) AND ("NATION"."N_REGIONKEY" = "REGION"."R_REGIONKEY" AND CAST("REGION"."R_NAME" AS VARCHAR(25)) = 'MIDDLE EAST' AND ("SUPPLIER"."S_NATIONKEY" = "NATION0"."N_NATIONKEY" AND ("ORDERS"."O_ORDERDATE" >= DATE '1995-01-01' AND "ORDERS"."O_ORDERDATE" <= DATE '1996-12-31' AND "PART"."P_TYPE" = 'PROMO BRUSHED COPPER')))
    GROUP BY EXTRACT(YEAR FROM "ORDERS"."O_ORDERDATE")
    ORDER BY 1) AS "t3"

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions