Why Secondary Indexes?
By default, Cassandra only allows efficient queries on partition key and clustering columns. Secondary indexes let you query on non-key columns without using ALLOW FILTERING. However, they come with significant trade-offs and should be used carefully.
Creating a Secondary Index
The CREATE INDEX statement adds an index on a non-key column. The index is stored locally on each node, meaning queries on indexed columns must fan out to every node in the cluster.
-- Create a secondary index on email
CREATE INDEX idx_users_email ON users (email);
-- Now you can query by email
SELECT * FROM users WHERE email = 'alice@example.com';
-- Index on a collection column (values)
CREATE INDEX idx_article_tags ON articles (VALUES(tags));
-- Index on map keys
CREATE INDEX idx_prefs_keys ON user_profiles (KEYS(preferences));
-- Index on map entries
CREATE INDEX idx_prefs_entries ON user_profiles (ENTRIES(preferences));When Secondary Indexes Work Well
Secondary indexes perform best under specific conditions. They are acceptable when combined with the partition key, and when the indexed column has moderate cardinality.
-- Good: index query combined with partition key
-- Only hits one node's local index
SELECT * FROM orders
WHERE customer_id = 'cust-123'
AND status = 'shipped';
-- Acceptable: moderate cardinality column
-- e.g., status with ~5 distinct values across millions of rows
CREATE INDEX idx_orders_status ON orders (status);When to Avoid Secondary Indexes
Avoid secondary indexes on high-cardinality columns (like email or UUID) when querying without the partition key, because the query must contact every node. Also avoid them on very low cardinality columns (like boolean) as a large percentage of rows will match.
- High-cardinality columns without partition key: fan-out to all nodes is expensive.
- Very low cardinality (boolean): returns too many rows, not selective enough.
- Frequently updated columns: index maintenance adds write overhead.
- Tables with heavy delete patterns: tombstones in the index degrade performance.
Materialized Views as an Alternative
Instead of secondary indexes, you can create a materialized view that duplicates data with a different primary key. This is the query-driven approach to data modeling that Cassandra favors.
-- Create a materialized view for querying users by email
CREATE MATERIALIZED VIEW users_by_email AS
SELECT * FROM users
WHERE email IS NOT NULL AND user_id IS NOT NULL
PRIMARY KEY (email, user_id);
-- Now query efficiently by email
SELECT * FROM users_by_email
WHERE email = 'alice@example.com';Materialized views are maintained automatically by Cassandra but add write amplification. The most common and recommended alternative is to manually maintain a denormalized lookup table.
Managing Indexes
-- Drop an index
DROP INDEX IF EXISTS idx_users_email;
-- View existing indexes
DESC INDEX idx_users_email;Key Takeaways
- Secondary indexes are local to each node -- queries without a partition key fan out to all nodes.
- They work best on moderate-cardinality columns combined with the partition key.
- Avoid indexing high-cardinality, low-cardinality, or frequently updated columns.
- Prefer denormalized tables or materialized views for production query patterns.
Try this query in UnifySQL
Write, optimize, and collaborate on Cassandra queries with AI assistance.
Start Free