Error message:
In Greenplum database, when we use stored procedures to query data, we will report the following error information
function cannot execute on a QE slice because it accesses relation
First, let’s understand the replication table below
Replicated table
Greenplum 6 supports a new distribution strategy: copy table, that is, the whole table has a complete copy on each node.
test=# CREATE TABLE t2 (id int) DISTRIBUTED REPLICATED; CREATE TABLE test=# INSERT INTO t2 VALUES (1), (2), (3); INSERT 0 3 test=# SELECT * FROM t2; id ---- 1 2 3 (3 rows) test=# SELECT gp_segment_id, * from t2; gp_segment_id | id ---------------+---- 0 | 1 0 | 2 0 | 3
UDF cannot access any tables on segment. Due to the characteristics of MPP, any segment only contains part of the data, so the UDF executed in the segment cannot access any tables, otherwise the data calculation is wrong.
yydzero=# CREATE FUNCTION c() RETURNS bigint AS $$ yydzero$# SELECT count(*) from t1 AS result; yydzero$# $$ LANGUAGE SQL; CREATE FUNCTION yydzero=# SELECT c(); c --- 6 (1 row) yydzero=# select c() from t2; ERROR: function cannot execute on a QE slice because it accesses relation "public.t1" (seg0 slice1 192.168.1.107:25435 pid=76589)
If the above T1 is changed to a copy table, this problem does not exist.
Avoid distributed query plans:
If the data of a table is copied on each segment, a local connection plan can be generated to avoid data movement between different nodes of the cluster. If a replicated table is used to store a table with a small amount of data (such as thousands of rows), the performance will be significantly improved. A table with a large amount of data is not suitable for the replicated table mode.
Solution:
If the reason is found, it is simple. You only need to modify the table creation statement to solve the problem
ALTER TABLE table_name SET DISTRIBUTED REPLICATED;