
Cassandra 0.7 Can Pack 2 Billion Columns Into a Row

angry tapir writes "The cadre of volunteer developers behind the Cassandra distributed database have released the latest version of their open source software. The newly installed Large Row Support feature of Cassandra version 0.7 allows the database to hold up to 2 billion columns per row. Previous versions had no set limit on the number of columns, though the maximum amount of material that could be held in a single row was approximately 2GB. That 2GB limit has been eliminated."


  • by loufoque ( 1400831 ) on Sunday January 16, 2011 @09:04PM (#34900852)

    ... then you're doing it wrong

  • Why? (Score:4, Insightful)

    by Xoc-S ( 645831 ) on Sunday January 16, 2011 @09:08PM (#34900870)
    Only a completely de-normalized flat-file database would need anything like that number of columns. That would mean many duplicate pieces of information, and a complete maintenance nightmare. The only purpose I can see is to have views of existing normalized data for fast searching, but that would be read-only data.

    This is a feature in need of an application and I can see very few applications.

  • by Giant Electronic Bra ( 1229876 ) on Sunday January 16, 2011 @11:35PM (#34901606)
    That we had all of this stuff 30 years ago. It was called 'network' databases, which were pretty much the standard technology before RDBMS came along and everyone realized how much better relational algebra was for the vast majority of problems. As with many other things, older ideas eventually resurface with new names and a few more features. There are times when this kind of facility is useful; nothing wrong with it. In the vast majority of cases where I've seen people using something like Cassandra or Bigtable, though, it was ill-advised. A properly optimized RDBMS with a correctly designed schema can handle all but a few edge cases. Most of the hype these tools are generating is based on a lack of real understanding of how to use databases properly, combined with people believing myths about other technologies, and helped along by the industry's short memory span. The best part, though, is that when something turns into a giant mess, guys like me can make nice money fixing the mess. lol.
  • by Anonymous Coward on Monday January 17, 2011 @03:54AM (#34902542)

    - Not everyone answers every question. There is skip logic involved and there are loops, sometimes nested. These would lend themselves well to a multi-table relational approach, but the data does not come out of the data collection systems like that (most of them anyway). It would be nice to normalize, but as mentioned, there are new datasets every week, most of them having 1000s of columns. Good luck with normalizing all that before your deadline.

    - Normalized data is not as easy to use in statistical applications. SPSS, the 800 pound gorilla in stats land, only supports flat data, for example.

    - There are things called multiple response questions, sometimes having 100s of options, sometimes 1000s. Ergo 100s to 1000s of columns per question. "Which car models have you ever owned" + every single car model produced in the last 40 years is a good example. Of course there are alternatives such as blob fields and bit shifting, or storing only max 20 answers (first car, second car, etc) but it costs time to convert them. And these formats are also harder to use in statistical analysis, even in flat data.

    In a world where you have complete control over the provided input, and the required output, you are right. In the real world, not so much.
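For readers weighing the trade-off described above, here is a minimal sketch of the conversion cost being discussed: turning a flat, one-column-per-option multiple response question into normalized (respondent, option) pairs. The column names, the `car_` prefix, and the helper `wide_to_long` are hypothetical, chosen only for illustration; real survey exports will differ.

```python
from typing import Dict, List, Tuple


def wide_to_long(
    rows: List[Dict[str, object]], id_col: str, prefix: str
) -> List[Tuple[object, str]]:
    """Turn one-flag-column-per-option rows into (respondent, option) pairs.

    Hypothetical helper: assumes a flag value of 1 means "option selected".
    """
    pairs = []
    for row in rows:
        for col, value in row.items():
            # Only columns belonging to this question, and only ticked options.
            if col.startswith(prefix) and value == 1:
                pairs.append((row[id_col], col[len(prefix):]))
    return pairs


# Made-up flat data: one row per respondent, one column per car model.
survey = [
    {"respondent": 1, "car_golf": 1, "car_civic": 0, "car_mini": 1},
    {"respondent": 2, "car_golf": 0, "car_civic": 1, "car_mini": 0},
]

long_rows = wide_to_long(survey, "respondent", "car_")
print(long_rows)
```

The long form stores only the options that were actually selected, but, as the comment notes, tools that expect flat data would need it pivoted back before analysis.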

  • by AlXtreme ( 223728 ) on Monday January 17, 2011 @05:59AM (#34902936) Homepage Journal

    Dear $DEITY, the number of times I've seen (mostly) PHP crapplications use CREATE DATABASE and CREATE / ALTER TABLE, often with ingenious naming schemes, instead of simply inserting new rows. Certain people shouldn't be allowed to touch databases.

    If anyone needs me I'll be sobbing over my coffee.
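A minimal sketch of the "simply inserting new rows" approach, using Python's built-in `sqlite3` module with an in-memory database. The `orders` table and its columns are hypothetical; the point is only that a new customer becomes a new row value, not a new table.

```python
import sqlite3

# In-memory database for illustration only.
conn = sqlite3.connect(":memory:")

# One table created once; the customer is a column, not part of a table name.
conn.execute("CREATE TABLE orders (customer TEXT NOT NULL, item TEXT NOT NULL)")

# New customers and new orders are just new rows: no CREATE / ALTER TABLE.
new_orders = [("alice", "widget"), ("bob", "gadget"), ("alice", "gizmo")]
conn.executemany("INSERT INTO orders (customer, item) VALUES (?, ?)", new_orders)

# Per-customer data comes from a WHERE clause instead of a per-customer table.
alice_items = [
    row[0]
    for row in conn.execute(
        "SELECT item FROM orders WHERE customer = ? ORDER BY rowid", ("alice",)
    )
]

# The schema stays at a single table however many customers are added.
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
).fetchone()[0]

print(alice_items, table_count)
```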

