All entries for Tuesday 02 November 2010
November 02, 2010
SQL issue using GROUP BY and ORDER BY together
I am having an issue with SQL not behaving as I expect it to. (MySQL 5.1.41 although it shouldn’t be relevant.) I am hoping I’ve missed something really obvious but I just can’t see it at the moment, hence blog post.
I am dealing with submissions to a node with some data.
CREATE TABLE test (time int, nodeid int, data varchar(50));
DESCRIBE test;
+--------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| time | int(11) | YES | | NULL | |
| nodeid | int(11) | YES | | NULL | |
| data | varchar(50) | YES | | NULL | |
+--------+-------------+------+-----+---------+-------+
time is the primary key for purposes of this explanation
nodeid is a foreign key
data is some information associated with a submission
INSERT INTO test (time, nodeid, data) VALUES (1, 1, 'ONE');
INSERT INTO test (time, nodeid, data) VALUES (2, 2, 'TWO');
INSERT INTO test (time, nodeid, data) VALUES (3, 3, 'THREE');
INSERT INTO test (time, nodeid, data) VALUES (4, 4, 'FOUR');
INSERT INTO test (time, nodeid, data) VALUES (5, 1, 'FIVE');
INSERT INTO test (time, nodeid, data) VALUES (6, 2, 'SIX');
INSERT INTO test (time, nodeid, data) VALUES (7, 1, 'SEVEN');
INSERT INTO test (time, nodeid, data) VALUES (8, 3, 'EIGHT');
SELECT * FROM test;
+------+--------+-------+
| time | nodeid | data |
+------+--------+-------+
| 1 | 1 | ONE |
| 2 | 2 | TWO |
| 3 | 3 | THREE |
| 4 | 4 | FOUR |
| 5 | 1 | FIVE |
| 6 | 2 | SIX |
| 7 | 1 | SEVEN |
| 8 | 3 | EIGHT |
+------+--------+-------+
Node 1 has three submissions: ONE, FIVE, SEVEN
Node 2 has two submissions: TWO, SIX
Node 3 has two submissions: THREE, EIGHT
Node 4 has one submission: FOUR
We want to show the submissions, most recent first:
SELECT time, nodeid, data FROM test ORDER BY time DESC;
+------+--------+-------+
| time | nodeid | data |
+------+--------+-------+
| 8 | 3 | EIGHT |
| 7 | 1 | SEVEN |
| 6 | 2 | SIX |
| 5 | 1 | FIVE |
| 4 | 4 | FOUR |
| 3 | 3 | THREE |
| 2 | 2 | TWO |
| 1 | 1 | ONE |
+------+--------+-------+
But now we want to show only the most recent submission for each node, so we try:
SELECT time, nodeid, data FROM test GROUP BY nodeid ORDER BY time DESC;
+------+--------+-------+
| time | nodeid | data |
+------+--------+-------+
| 4 | 4 | FOUR |
| 3 | 3 | THREE |
| 2 | 2 | TWO |
| 1 | 1 | ONE |
+------+--------+-------+
That’s not what we want! The most recent submissions are 8, 7, 6 and 4.
Among other things I have tried:
SELECT time, nodeid, data FROM test GROUP BY nodeid ORDER BY MAX(time) DESC;
(returns first submissions ordered by last submission time)
SELECT MAX(time), nodeid, data FROM test GROUP BY nodeid ORDER BY MAX(time) DESC;
(returns first submission data but last submission time)
SELECT time, nodeid, data FROM test GROUP BY nodeid HAVING time = MAX(time) ORDER BY time DESC;
(returns only one result)
SELECT time, MAX(time) timemax, nodeid, data FROM test WHERE time = timemax GROUP BY nodeid ORDER BY time DESC;
(error: can’t use aggregation functions in WHERE clause)
Is there anyone who can help me out and suggest how to get the result set I need?
Alan Hazelden
Please wait - comments are loading
Loading…