Friday, 2 August 2019

CQL data types

CQL data types

CQL Type
Constants supported
Description
ascii
strings
US-ASCII character string
bigint
integers
64-bit signed long
blob
blobs
Arbitrary bytes (no validation), expressed as hexadecimal
boolean
booleans
true or false
counter
integers
Distributed counter value (64-bit long)
date
strings
Value is a date with no corresponding time value; Cassandra encodes date as a 32-bit integer representing days since epoch (January 1, 1970). Dates can be represented in queries and inserts as a string, such as 2015-05-03 (yyyy-mm-dd)
decimal
integers, floats
Variable-precision decimal
Java type
Note: When dealing with currency, it is a best practice to have a currency class that serializes to and from an int or use the Decimal form.
double
integers, floats
64-bit IEEE-754 floating point
Java type
float
integers, floats
32-bit IEEE-754 floating point
Java type
frozen
user-defined types, collections, tuples
A frozen value serializes multiple components into a single value. Non-frozen types allow updates to individual fields. Cassandra treats the value of a frozen type as a blob. The entire value must be overwritten.
Note: Cassandra no longer requires the use of frozen for tuples:
frozen <tuple <int, tuple<text, double>>>
inet
strings
IP address string in IPv4 or IPv6 format, used by the python-cql driver and CQL native protocols
int
integers
32-bit signed integer
list
n/a
A collection of one or more ordered elements: [literal, literal, literal].
CAUTION:
Lists have limitations and specific performance considerations. Use a frozen list to decrease impact. In general, use a set instead of list.
map
n/a
A JSON-style array of literals: { literal : literal, literal : literal ... }
set
n/a
A collection of one or more elements: { literal, literal, literal }
smallint
integers
2 byte integer
text
strings
UTF-8 encoded string
time
strings
Value is encoded as a 64-bit signed integer representing the number of nanoseconds since midnight. Values can be represented as strings, such as 13:30:54.234.
timestamp
integers, strings
Date and time with millisecond precision, encoded as 8 bytes since epoch. Can be represented as a string, such as 2015-05-03 13:30:54.234.
timeuuid
uuids
Version 1 UUID only
tinyint
integers
1 byte integer
tuple
n/a
A group of 2-3 fields.
uuid
uuids
A UUID in standard UUID format
varchar
strings
UTF-8 encoded string
varint
integers
Arbitrary-precision integer
Java type

Blob type
The Cassandra blob data type represents a constant hexadecimal number defined as 0[xX](hex)+ where hex is a hexadecimal character, such as [0-9a-fA-F]. For example, 0xcafe. The maximum theoretical size for a blob is 2 GB. The practical limit on blob size, however, is less than 1 MB. A blob type is suitable for storing a small image or short string.
Blob conversion functions 
These functions convert the native types into binary data (blob):
typeAsBlob(value)
blobAsType(value)
For every native, nonblob data type supported by CQL, the typeAsBlob function takes a argument of that data type and returns it as a blob. Conversely, the blobAsType function takes a 64-bit blob argument and converts it to a value of the specified data type, if possible.
This example shows how to use bigintAsBlob:
CREATE TABLE bios ( user_name varchar PRIMARY KEY,    bio blob );
INSERT INTO bios (user_name, bio) VALUES ('fred', bigintAsBlob(3));
SELECT * FROM bios;
user_name | bio
---------------+--------------------
fred            | 0x0000000000000003

Collection type
A collection column is declared using the collection type, followed by another type, such as int or text, in angle brackets. For example, you can create a table having a list of textual elements, a list of integers, or a list of some other element types.
list<text>
list<int>
Collection types cannot be nested, but frozen collection types can be nested inside frozen or non-frozen collections. For example, you may define a list within a list, provided the inner list is frozen:
list<frozen <list<int>>>
Indexes may be created on a collection column of any type.
Using frozen in a collection 
A frozen value serializes multiple components into a single value. Non-frozen types allow updates to individual fields. Cassandra treats the value of a frozen type as a blob. The entire value must be overwritten.
column_name collection_type<data_type, frozen<column_name>>
For example:
CREATE TABLE mykeyspace.users (
  id uuid PRIMARY KEY,
  name frozen <fullname>,
  direct_reports set<frozen <fullname>>,     // a collection set
  addresses map<text, frozen <address>>     // a collection map
  score set<frozen <set<int>>>              // a set with a nested frozen set
);

UUID and timeuuid types
The UUID (universally unique id) comparator type is used to avoid collisions in column names. Alternatively, you can use the timeuuid.
Timeuuid types can be entered as integers for CQL input. A value of the timeuuid type is a Type 1 UUID. A Version 1 UUID includes the time of its generation and are sorted by timestamp, making them ideal for use in applications requiring conflict-free timestamps. For example, you can use this type to identify a column (such as a blog entry) by its timestamp and allow multiple clients to write to the same partition key simultaneously. Collisions that would potentially overwrite data that was not intended to be overwritten cannot occur.
UUID and timeuuid types

Using a counter
To load data into a counter column, or to increase or decrease the value of the counter, use the UPDATE command. Cassandra rejects USING TIMESTAMP or USING TTL in the command to update a counter column.
Procedure
Create a table for the counter column.
cqlsh> USE cycling;
cqlsh> CREATE TABLE popular_count (id UUID PRIMARY KEY,  popularity counter);
Loading data into a counter column is different than other tables. The data is updated rather than inserted.
UPDATE cycling.popular_count SET popularity = popularity + 1  WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47;
Take a look at the counter value and note that popularity has a value of 1.
SELECT * FROM cycling.popular_count;

Additional increments or decrements will change the value of the counter column.

UUID and timeuuid functions
The uuid() function takes no parameters and generates a random Type 4 UUID suitable for use in INSERT or UPDATE statements.
Several timeuuid() functions are designed for use with the timeuuid() type:
dateOf()
Used in a SELECT clause, this function extracts the timestamp of a timeuuid column in a result set. This function returns the extracted timestamp as a date. Use unixTimestampOf() to get a raw timestamp.
now()
In the coordinator node, generates a new unique timeuuid in milliseconds when the statement is executed. The timestamp portion of the timeuuid conforms to the UTC (Universal Time) standard. This method is useful for inserting values. The value returned by now() is guaranteed to be unique.
minTimeuuid() and maxTimeuuid()
Returns a UUID-like result given a conditional time component as an argument. For example:
SELECT * FROM myTable WHERE t > maxTimeuuid('2013-01-01 00:05+0000') AND t < minTimeuuid('2013-02-02 10:00+0000')
The min/maxTimeuuid example selects all rows where the timeuuid column, t, is strictly later than 2013-01-01 00:05+0000 but strictly earlier than 2013-02-02 10:00+0000. The t >= maxTimeuuid('2013-01-01 00:05+0000') does not select a timeuuid generated exactly at 2013-01-01 00:05+0000 and is essentially equivalent to t > maxTimeuuid('2013-01-01 00:05+0000').
The values returned by minTimeuuid and maxTimeuuid functions are not true UUIDs in that the values do not conform to the Time-Based UUID generation process specified by the RFC 4122. The results of these functions are deterministic, unlike the now() function.
unixTimestampOf()
Used in a SELECT clause, this functions extracts the timestamp in milliseconds of a timeuuid column in a result set. Returns the value as a raw, 64-bit integer timestamp.
Cassandra 2.2 and later support some additional timeuuid and timestamp functions to manipulate dates. The functions can be used in INSERT, UPDATE, and SELECT statements.
toDate(timeuuid)
Converts timeuuid to date in YYYY-MM-DD format.
toTimestamp(timeuuid)
Converts timeuuid to timestamp format.
toUnixTimestamp(timeuuid)
Converts timeuuid to UNIX timestamp format.
toDate(timestamp)
Converts timestamp to date in YYYY-MM-DD format.
toUnixTimestamp(timestamp)
Converts timestamp to UNIX timestamp format.
toTimestamp(date)
Converts date to timestamp format.
toUnixTimestamp(date)
Converts date to UNIX timestamp format.
An example of the new functions creates a table and inserts various time-related values:
CREATE TABLE sample_times (a int, b timestamp, c timeuuid, d bigint, PRIMARY KEY (a,b,c,d));
INSERT INTO sample_times (a,b,c,d) VALUES (1, toUnixTimestamp(now()), 50554d6e-29bb-11e5-b345-feff819cdc9f, toTimestamp(now()));
Select data and convert it to a new format:

SELECT toDate(c) FROM sample_times;

No comments:

Post a Comment

Architecture of Cassandra

Architecture of Cassandra A Cassandra instance is a collection of independent nodes that are configured together into a cluster. In a C...