How to export data from Cassandra to a JSON file using Python or another language?

I want to export data from Cassandra to a JSON file, because Pentaho doesn't support my version of Cassandra (3.10).

You can simply add the JSON keyword after SELECT to get your results in JSON format:

cqlsh:cycling> select json name, checkin_id, timestamp from checkin;
 {"name": "BRAND", "checkin_id": "50554d6e-29bb-11e5-b345-feff8194dc9f", "timestamp": "2016-08-28 21:45:10.406Z"}
 {"name": "VOSS", "checkin_id": "50554d6e-29bb-11e5-b345-feff819cdc9f", "timestamp": "2016-08-28 21:44:04.113Z"}
(2 rows)
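
Since the question mentions Python, here is a minimal sketch of running the same kind of query from a script and saving the result to a file. It assumes the DataStax Python driver (`cassandra-driver`) is installed and a node is reachable at 127.0.0.1; the function names `write_json_rows` and `export_table` and the default host are made up for illustration.

```python
import json


def write_json_rows(rows, path):
    """Write an iterable of rows to `path` as one JSON array; return the row count.

    Each row is expected to carry the JSON text produced by SELECT JSON
    in its first column (row[0]).
    """
    docs = [json.loads(row[0]) for row in rows]  # parsing also validates each row
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(docs, fh, indent=2)
    return len(docs)


def export_table(keyspace, table, path, hosts=("127.0.0.1",)):
    """Run SELECT JSON against a cluster and save the result to `path`.

    Assumption: `hosts` points at a reachable Cassandra node. The keyspace and
    table names are interpolated into the query, so pass trusted values only.
    """
    # Imported here so write_json_rows works even without the driver installed.
    from cassandra.cluster import Cluster  # pip install cassandra-driver

    cluster = Cluster(list(hosts))
    try:
        session = cluster.connect()
        rows = session.execute(f"SELECT JSON * FROM {keyspace}.{table}")
        return write_json_rows(rows, path)
    finally:
        cluster.shutdown()
```

For example, `export_table("cycling", "checkin", "checkin.json")` would write the table above to checkin.json. The sketch collects every row in memory before writing, so it is probably only suitable for small to medium tables.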

DataStax now provides a tool called DSBulk that works with both DSE and Cassandra and is heavily optimized for loading and unloading data. It also supports JSON as an output format, like this:

dsbulk unload -k keyspace -t table -url out_dir -c json

More examples of unloading data can be found in this blog post, which is part of a series of blog posts on DSBulk. For example, you can specify which columns of the table to unload.
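
DSBulk writes its output as several files inside the target directory rather than as one file. If a single JSON file is needed, something like the following Python sketch could merge them. It assumes each output file ends in .json and holds one JSON document per line, which is my understanding of the default unload format; `merge_json_lines` is a hypothetical helper name.

```python
import json
from pathlib import Path


def merge_json_lines(out_dir, target):
    """Merge every *.json file in `out_dir` into one JSON array at `target`.

    Assumption: each input file contains one JSON document per line.
    Write `target` outside `out_dir` so a rerun does not pick it up as input.
    """
    docs = []
    for part in sorted(Path(out_dir).glob("*.json")):
        with part.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:  # skip blank lines
                    docs.append(json.loads(line))
    with open(target, "w", encoding="utf-8") as fh:
        json.dump(docs, fh)
    return len(docs)
```

For example, `merge_json_lines("out_dir", "table.json")` after the dsbulk command above.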

I had the same need to export cassandra tables as JSON and built a command line tool for it:

You can use bash redirection to get a JSON file.

cqlsh -e "select JSON * from ${keyspace}.${table}" | awk 'NR>3 {print $0}' | head -n -2 > table.json
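
The awk/head pipeline relies on the header and footer having a fixed number of lines. As a hedged alternative, a small Python filter could keep only the lines that actually parse as JSON. This is a sketch, assuming cqlsh prints each row on a single line as in the output shown earlier; `extract_json_rows` and `main` are illustrative names.

```python
import json
import sys


def extract_json_rows(lines):
    """Yield parsed JSON objects from cqlsh output, skipping non-JSON lines."""
    for line in lines:
        text = line.strip()
        if not text.startswith("{"):
            continue  # header, separator, blank line, or "(N rows)" footer
        try:
            yield json.loads(text)
        except json.JSONDecodeError:
            continue  # not a complete JSON document; ignore it


def main():
    """Read cqlsh output from stdin and write one JSON array to stdout."""
    json.dump(list(extract_json_rows(sys.stdin)), sys.stdout, indent=2)
```

With a call to `main()` added at the end of a script (say, a hypothetical filter_json.py), it could be used as: cqlsh -e "select JSON * from ${keyspace}.${table}" | python filter_json.py > table.json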

  • There is a JDBC driver for Cassandra, so Pentaho could treat Cassandra as a typical SQL database. We've used Pentaho+Cassandra in one of our projects, though I'm not sure which Cassandra version it was.
  • But I want a JSON file, not just JSON format. How can I save this JSON?