WHERE queries because of its internal structure. This article will seem complicated, confusing, if you have not read the first article from the cycle, where I told you how C * works. Read it, please, before proceeding to this.SELECT queries Cassandra Query Language ( CQL ) there are no familiar SQL operations JOIN , GROUP BY . And the WHERE operation is severely curtailed. In SQL, you can filter by any column, whereas in CQL only by partition keys ( partition keys ), cluster keys ( clustering columns ) and secondary indexes .Note: In C * 2.0, you can create secondaryINDEXin any column like SQL indexes. In fact, Cassandra's secondary indexes are an extra table hidden from you, so the performance ofWHEREqueries on them is worse than queries on key columns.
ad_click table created in the previous , second article. It also explains the terminology used in this article. I also recommend at least a look. CREATE TABLE ad_click ( reseller_id text, day text, -- day in the format of 'YYYY-MM-DD' time timestamp, ad_id text, amount float, PRIMARY KEY ((reseller_id, day), time, ad_id) -- (reseller_id,day) (time,ad_id) ) WITH CLUSTERING ORDER BY (time DESC); WHERE request tells Cassandra to find the node where the string is stored and pass the request there.JOIN , it is impossible to compare columns with each other. SELECT * from ad_click WHERE maxTimeuuid(day) = maxTimeuuid(3141592653589); -- ; WHERE query conditions cannot be combined with the OR operator, only AND work. And only a few comparison operators work, and even then not always. SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND -- OK day = '2013-11-29' AND -- OK time = 3141592653589 AND -- OK ad_id = '890_567_234'; -- OK SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND amount = 0; -- ! . SELECT * FROM ad_click WHERE day = '2013-11-29' AND -- ! reseller_id. time = 3141592653589 AND ad_id = '890_567_234'; SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND ad_id = '890_567_234'; -- ! time. SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day IN ('2013-11-28', '2013-11-29') AND -- OK time = 3141592653589 AND ad_id IN ('890_567_234', '890_567_010'); -- OK SELECT * FROM ad_click WHERE reseller_id IN ('supaboobs') AND -- ! . day = '2013-11-28' AND time = 3141592653589 AND ad_id = '890_567_234'; SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-28' AND time IN (3141592653589) AND -- ! . ad_id = ('890_567_234'); SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND -- OK day = '2013-11-29'; -- OK SELECT * FROM ad_click WHERE 'supaboobs' = reseller_id AND -- ! . '2013-11-29' = day AND -- ! . 3141592653589 < time; -- ! . SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND time >= 3141592653589; -- OK SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND time = 3141592653589 AND ad_id > '890_567_234'; -- OK SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND time >= 3141592653589 AND -- ! . ad_id = '890_567_234'; SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND time >= 3141592653589 AND ad_id < '890_567_234'; -- ! . SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND ad_id < '890_567_234'; -- ! time. SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day < '2013-11-29'; -- ! . ALLOW FILTERING . All other restrictions remain.IMPORTANT! Timeout in this case is very likely, because you run over all the nodes, across all rows.
SELECT * FROM ad_click WHERE time = 3141592653589 AND -- OK ad_id > '890_567_234' -- OK ALLOW FILTERING; SELECT * FROM ad_click WHERE time >= 3141592653589 AND -- OK time <= 3141592653589 -- OK ALLOW FILTERING; SELECT * FROM ad_click WHERE time >= 3141592653589 AND ad_id > '890_567_234' -- ! . ALLOW FILTERING; SELECT * FROM ad_click WHERE time >= 3141592653589 AND time <= 3241592653589 AND ad_id = '890_567_234' -- ! . ALLOW FILTERING; CREATE INDEX on ad_click (amount); SELECT * FROM ad_click WHERE amount = 0.0075; -- OK SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND time = 3141592653589 AND ad_id = '890_567_234' AND amount = 0.0075; -- OK SELECT * FROM ad_click WHERE reseller_id = 'supaboobs' AND day = '2013-11-29' AND time = 3141592653589 AND ad_id = '890_567_234' AND amount > 0.0; -- ! . Previous article of the cycle .
Source: https://habr.com/ru/post/205176/
All Articles