With multi-dimensional model as the core, let the factory digital transformation and upgrading “within reach”>>>
Problem description
Execute the following SQL in hive:
select av_seq, cp_seq from dw.smbrandcp group by av_seq;
Prompt error:
FAILED: SemanticException [Error 10025]: Line 1:15 Expression not in GROUP BY key 'cp_seq'
Error analysis
Obviously, the error message has told us that this is a syntax error. When using the group by clause, the select statement can only contain fields in the group by clause. Of course, there can be multiple aggregate functions (such as count) in the select statement
In this example, the smbrandcp table is roughly as follows:
av_ seq | cp_ seq |
---|---|
944076 | CC200940 |
944076 | CC200743 |
944076 | CC200777 |
When executing the following query:
select av_seq, cp_seq from dw.smbrandcp group by av_seq;
Because it’s only in AV_ If you do aggregation on the SEQ field, then for the aggregation operation AV_ seq = 944076, cp_ What value should SEQ take?This creates a variety of possible options. Hive’s way to deal with this kind of SQL whose field is in the select statement but not in the group by clause is to treat it as an invalid SQL, because what HQL wants is precision
References
Hive query and group by clause