Catch up on stories from the past week (and beyond) at the Slashdot story archive

 



Forgot your password?
typodupeerror
×
Databases Programming Software IT

MySQL Falcon Storage Engine Open Sourced 235

An anonymous reader writes "The code for the Falcon Storage Engine for MySQL has been released as open source. Jim Starkey, known as the father of Interbase, is behind its creation; previously he was involved with the Firebird SQL database project. Falcon looks to be the long-awaited open source storage engine that may become the primary choice for MySQL, and along the way offer some innovation and performance improvements over current alternatives." This is an alpha release for Windows (32-bit) and Linux (32- and 64-bit) only, and is available only in a specially forked release of MySQL 5.1.
This discussion has been archived. No new comments can be posted.

MySQL Falcon Storage Engine Open Sourced

Comments Filter:
  • Please explain (Score:3, Interesting)

    by Shimmer ( 3036 ) on Tuesday January 02, 2007 @03:57PM (#17435526) Journal
    I'm a developer, but I've never used MySQL.

    Isn't MySQL already open source? If so, how does the Falcon storage engine differ from the "regular" storage engine that comes with MySQL?
    • Re: (Score:3, Funny)

      by Anonymous Coward
      It doesn't suck.
      • by nuzak ( 959558 )
        It doesn't support foreign keys, so sucking is still one of its main feature points.
        • by 0racle ( 667029 )
          Yes it does.
          • Re: (Score:2, Informative)

            by Torst ( 22795 )
            My understanding is that the Falcon engine, like MyISAM and SQLite, supports the SYNTAX for defining foreign keys, but do NOT actually enforce the foreign key constraints! That's quite an important difference. I'll stick to PostgreSQL for when I care about relational integrity.
          • Re: (Score:3, Funny)

            by nuzak ( 959558 )
            You might want to inform MySQL AB and tell them to update their page at http://www.mysql.org/doc/refman/5.1/en/se-falcon-l imits.html [mysql.org] which says "Foreign key support is currently not available." I'm sure they'll be thrilled to hear the news.
        • Re: (Score:3, Insightful)

          by jadavis ( 473492 )
          It doesn't support foreign keys

          Here's what gets me about MySQL. They say they have "pluggable" storage engines, but there's no clean abstraction. Each engine supports some things and not others.

          Pluggable engines might be useful if the only differences are in the implementation, storage requirements, performance, and other administrative aspects. However, a constraint violation that will cause an error in one storage engine passes right through another storage engine. So, it's not like you can just swap one
      • It doesn't suck.

        This message, posted to one of the highest volume sites on the internet, brought to you by InnoDB (one of MySQL's default storage engines).
    • Re:Please explain (Score:5, Informative)

      by Anonymous Coward on Tuesday January 02, 2007 @04:06PM (#17435616)
      MySQL itself is Open Source. But that only gives you a few storage Engines. The specific storage engines have different licenses. It is perfectly possible to have commercial storage engine for MySQL.

      MySQL has no "native" way to store or obtain data - everything goes through plugins, some of which ship with MySQL some don't.

      MyISAM - the most common and fastest. But no transactions, no ACID, etc. Good for many read-only or non critical tables.
      InnoDB - licensed from InnoSoft (now oracle). GPL for non commercial, extra dollars for commercial. Transactions, ACID, but a bit slow. .... other storage engines also exist
      • Re: (Score:2, Informative)

        by Entrope ( 68843 )
        This is probably a FAQ, and wandering off-topic, but exactly how the cheese do you have "GPL for non commercial, extra dollars for commercial" and expect it to work? Even InnoDB's licenses page implies that the limitation is "GPL, extra dollars for non-copyleft".
        • Re:Please explain (Score:4, Insightful)

          by rongage ( 237813 ) on Tuesday January 02, 2007 @05:22PM (#17436446)

          This is easy and one of the tenants of so-called dual licensing setups...

          Basically, if you don't want to pay to use the software, you are bound to the terms of the GPL. If you don't want to be bound to the terms of the GPL, you gotta pay.

          • Re: (Score:3, Informative)

            by Entrope ( 68843 )
            I mentioned a valid dual-licensing offer in my post. The original post said that the GPL version came with a restriction on commercial use, which would be contrary to the terms of the GPL as they are generally understood. Thanks for reading the details before you reply!
            • by stg ( 43177 )
              He probably did read the details. Some people automatically think that because something is GPL you can't have commercial use, just because then who'd pay you when they could get it for free?

              In many situations (you sell a product that uses MySql, for example) it just makes more sense to buy a license instead of abiding by the GPL.

              However, there are a few (obvious) situations where you can use GPL software for profit:
              1) Server-side use - no need to provide source code.
              2) Contractors who are required to give
      • Re: (Score:3, Interesting)

        by kosmosik ( 654958 )
        > GPL for non commercial, extra dollars for commercial.

        Well you can use GPL version for commercial projects and The Other license for totally uncomercial projects.
    • Re:Please explain (Score:5, Informative)

      by SirThomas ( 6833 ) * on Tuesday January 02, 2007 @04:08PM (#17435634) Homepage
      Stolen directly from the mysql website:

      Falcon has been specially developed for systems that are able to support larger memory architectures and multi-threaded or multi-core CPU environments. Most 64-bit architectures are ideal platforms for the Falcon engine, where there is a larger available memory space and 2-, 4- or 8-core CPUs available. It can also be deployed within a standard 32-bit environment.

      The Falcon storage engine is designed to work within high-traffic transactional applications. It supports a number of key features that make this possible:

              * True Multi Version Concurrency Control (MVCC) enables records and tables to be updated without the overhead associated with row-level locking mechanisms. The MVCC implementation virtually eliminates the need to lock tables or rows during the update process.
              * Flexible locking, including flexible locking levels and smart deadlock detection keep data protected and transactions and operations flowing at full speed.
              * Optimized for modern CPUs and environments to support multiple threads allowing multiple transactions and fast transaction handling.
              * Transaction-safe (fully ACID-compliant) and able to handle multiple concurrent transactions.
              * Serial Log provides high performance and recovery capabilities without sacrificing performance.
              * Advanced B-Tree indexes.
              * Data compression stores the information on disk in a compressed format, compressing and decompressing data on the fly. The result is in smaller and more efficient physical data sizes.
              * Intelligent disk management automatically manages disk file size, extensions and space reclamation.
              * Data and index caching provides quick access to data without the requirement to load index data from disk.
              * Implicit savepoints ensure data integrity during transactions.
      • VACUUM? (Score:3, Interesting)

        by dskoll ( 99328 )

        So if Falcon uses MVCC, does it require something like PostgreSQL's VACUUM? Or does it have some other way to detect and remove dead tuples?

        Also, has anyone looked at making PostgreSQL a storage plugin for MySQL? :-)

        • Re:VACUUM? (Score:4, Funny)

          by gavinroy ( 94729 ) on Tuesday January 02, 2007 @05:10PM (#17436324) Homepage
          That's like putting a Volkswagen Bug body on top of a tank ;-)
        • Re: (Score:2, Informative)

          Seeing as it has its base in Interbase, I would be running away as quick as possible. Interbase/Firebird(?) has a SWEEP process (read: Vacuum), however it was far more sluggish than Postgres. Also, the MVCC has a transaction count limit on Interbase where you hit ~2 billion transactions you MUST do a backup/restore -- a simple sweep won't cut it. I was working with a database that eventually required bi-monthly restores and each restore took over 12 hours. I also always found it funny that Interbase's row
          • From what I know, the MVCC is in-memory only. I've followed the happenings at Firebird and Netfrastructure (Jim Starkey's previous company, from which Falcon is derived), and he has commented that the MVCC is handled in-memory. Disk-based records are always a single copy with no concept of back-records.

            The backup and restore issue in Firebird hopefully will be worked on soon, as I agree it is a big problem with high transaction rate applications. This is completely different than the sweep/vacuum issue, whi
      • by killjoe ( 766577 )
        Does it have unclustered primary keys? That's one thing that sucks with innodb and makes it virtually impossible to use GUIDs as primary keys.
        • I certainly understand the benefits of using GUID/UUIDs for primary keys, I often need to generate PKs/tags/identifiers in my real time system without doing a blocking INSERT and read the auto-increment value.

          However, as a user of InnoDB and its clustered indexes, I've found that you can easily use a UNIQUE index for the table and either let InnoDB create the "unseen" auto-increment INT PK, or just create the auto-increment INT PK openly, but use the UNIQUE GUID key for your identifiers. The performance is
          • by killjoe ( 766577 )
            That's the usual workaround for this predicament but it tends to create headaches with some ORMs. Anyway the usual purpose of using guids is to aid replication collusions and adding an autoincrement field complicates that (yes I know mysql has a workaround for that too!).

            Too bad postgres doesn't come with a GUID type. You have to track down some guys patch.

            Crazy huh? How both of the most popular open source databases are lacking some really basic features like controlling the clustering of your indexes, g
    • mysql is avaliable via the gpl. however, there are different backends that you can use within mysql. the default current one is kinda lame. there is another one that is legally owned by someone not mysql (oracle now, I think). in theory this falcon engine is super awesome, free (as in tinkle), and isn't encumbered with ownership issues.
    • by 0racle ( 667029 )
      MySQL can use many different storage engines, some open, some not. Having another Open engine is just a good thing to have.
    • Re: (Score:3, Informative)

      This is a real quick, detailess explanation, but explanation nonetheless... Essentially there are several options for "storage engine", each with its own set of features... The vanilla engine w/ MySQL is MyISAM, which among other things doesn't support transactions. In lieu of these shortcomings there is also the InnoDB engine, which does do transactions, etc.

      The Falcon engine is from a renowned database developer, and as such has all sorts of neat features [mysql.org].
    • Re: (Score:3, Insightful)

      by NineNine ( 235196 )
      Apparently, MySQL is a mish-mash of all kinds of different code. Heck, even the part that handles esoteric stuff such as stored procedures (note tongue in cheek) is part of some external module. I can't imagine that having multiple layers of API's is a good thing, considering how critical databases can be. Personally, I want my database to be consistent, and thoroughly tested, but the best being when the whole thing is designed to work together. Call me crazy, but it certainly does not give me a warm fu
  • by atomic777 ( 860023 ) on Tuesday January 02, 2007 @04:04PM (#17435596)
    I've been very excited since I first heard about this new storage engine adapted from Netfrastructure. Not only does it give MySQL a transactional storage engine that is not controlled by a hostile company, but the engine appears to be designed from the bottom up to support web traffic. Jim gave a great talk at the Boston MySQL meetup that you can watch here http://video.google.com/videoplay?docid=1929002440 950908895 [google.com]
  • What does Falcon do that the InnoDB engine doesn't currently do? The description seems rather vague.
    • Re: (Score:3, Insightful)

      by Just Some Guy ( 3352 )
      What does Falcon do that the InnoDB engine doesn't currently do?

      Get developed by a company that doesn't hate MySQL, for starters.

    • Re: (Score:3, Insightful)

      by namityadav ( 989838 )
      Like others said already .. First of all, Falcon allows MySQL to not be as dependent on InnoDB as it is right now. When Oracle got hold of InnoDB, people had serious doubts about MySQL's future. For a solution like MySQL, it's always good to have competing (Even if similar) engines for it's own survival / growth.
    • by DrJimbo ( 594231 ) on Tuesday January 02, 2007 @06:09PM (#17436980)
      The big thing Falcon brings is MVCC which allows safe simultaneous reading and and writing without locks.

      Here is a good explanation of PostgreSQL's MVCC [onlamp.com].

      • Re: (Score:2, Informative)

        by nuzak ( 959558 )
        Actually, InnoDB uses MVCC too. Hell, even BDB uses MVCC.

        Getting something optimized for read loads that scales in size better than MyISAM seems to be the primary motivation for Falcon.
      • Modded Informative? InnoDB is totally an MVCC database. It has 100% non-locking SELECTs. Come on.
  • I just setup a phpBB site that uses MySQL and I'm always keeping an eye on storage space and performance now. I don't suppose my host will upgrade MySQL any time soon, but this is still cool news...
    • Since the article and the product's website all state that this is an ALPHA, you likely will be quite happy that your host will not upgrade any time soon.
  • OK, although I don't agree with it, I know that a lot of people use MySQL because it's the most common database supported by web hosts. Isn't it almost as likely for a hosting company to have PostgreSQL, though, as to upgrade to a bleeding-edge version of MySQL when this is finally readied for public consumption? Will this new backend give any extra functionality that PostgreSQL doesn't offer?
    • by pembo13 ( 770295 )
      So you're suggesting that people use something that they are not comfortable with, instead of something that they are and which is getting better at an acceptable pace?
      • So you're suggesting that people use something that they are not comfortable with, instead of something that they are and which is getting better at an acceptable pace?

        Yes. I am indeed suggesting that people spend a few minutes learning how another Free, faster, and more powerful database works. If you're dabbling, then it's no harder to learn one than the other. If you're serious, then you owe it to yourself and your clients to know what's available.

      • So you're suggesting that people use something that they are not comfortable with, instead of something that they are and which is getting better at an acceptable pace?
        --
        As a boy I jumped through Windows, as a man I play with Penguins.
        Your advice appears to be at odds with your sig.
         
    • by unoengborg ( 209251 ) on Tuesday January 02, 2007 @09:19PM (#17438776) Homepage
      As a I see Falcon is mainly an attempt to:

      1) Get better performance on multiprocessor systems

      2) Get a decent storage engine that is not controlled by MySQL competitors

      As far as I can tell there is nothing in it that you can't get in Postgresql.
      Postgresql already performs better than the standard MySQL on multiprocessor systems.
      It remains to be seen if Falcon will be better than Postgresql once its production
      ready. Well, there is one thing, Falcon compresses data while Postgresql doesn't. can't help wondering what this will do to performance.

      On the other hand there seam to be a lot missing from Falcon that you find in Postgresql.
      If you read the Falcon limits page on the mysql site you find that it lacks e.g:

      - SELECT FOR UPDATE

      - No online backup

      - No foreign keys

      All in all, I would say Postgresql would be a better choice, if your web hosting company allows you to use it.

      MySQL have a tendency to slow down on many concurrent or complex queries. Postgresql is far better at handling triggers and can be programmed in many different programming languages. Support for domains and much more. MySQL also lacks EXCEPT, this makes some types of queries (relational division) much more complex than they have to be.

      Still for people that aren't free to choose their database, it is nice to see that MySQL
      makes some progress. Besides a little competition never hurts.
      • by sumbry ( 644145 )
        > Well, there is one thing, Falcon compresses data while Postgresql doesn't. can't help wondering what this will do to performance.

        This is actually not true. If you have a variable length column in Postgres and insert data that exceeds 2K in size, it will automatically be compressed by Postgres using a simple LZ compression algorithm. This has been the case since Postgres 8.0. You have to specifically disable this feature if you don't want to use it.

        http://www.postgresql.org/docs/8.0/static/storage- [postgresql.org]

  • for smaller databases, but limiting the tablespace to a single file per database/schema doesn't sound very flexible, and won't allow DBAs to maximize their disk throughput.

    I am guessing that this is more of a MyISAM replacement than an InnoDB replacement, so it's not really a shot across Oracle's bow (as some comments make it sound like).
    • Re: (Score:3, Informative)

      by TopSpin ( 753 ) *
      for smaller databases, but limiting the tablespace to a single file per database/schema doesn't sound very flexible, and won't allow DBAs to maximize their disk throughput.

      That isn't necessarily the case. Recently it has become popular to aggregate spindles into single stripe/mirror volumes with large stripe widths. This spreads I/O operations uniformly across disks. All disks contribute their IOPS capacity to all operations. Large stripe widths attempt to leverage high sequential IO bandwidth.

      Oracle ca
  • I had to drop MySQL (Score:2, Informative)

    by caluml ( 551744 )
    I was forced to learn Postgres because I wanted to do a SELECT * FROM foo WHERE bar NOT IN ( SELECT x FROM y WHERE z > 4 ) type query - basically a negative subselect. I couldn't do (or couldn't work out how to) use table JOINs to perform the same thing, and at the time, MySQL didn't do it.

    I'm so thankful. Postgres is so much better. And the table JOIns I did have ran so much faster in PG. s/mysql_/pg_/g *.php
    • Re: (Score:3, Informative)

      by RobKow ( 1787 )
      Well, for your particular example:

      SELECT * FROM foo LEFT OUTER JOIN y ON foo.bar = y.x AND y.z > 4 WHERE y.x IS NULL ;)

      But I'm sure there are nasty cases where you can't substitute joins readably or maybe at all.
      • by XO ( 250276 )
        I've never had to deal with anything more than a personal/small business database, and therefore am certainly no expert in database queries, but there have been many cases where I've been unable to figure out just what the hell I need to do to get the results i'm looking for straight from MySQL (or Access, which is unfortunatly what the small business database uses, and I am working on converting that sometime soon) ...
        so, I just end up requesting a result set, get that, do some operations on it in PHP, the
  • Oh, please (Score:4, Informative)

    by Lisandro ( 799651 ) on Tuesday January 02, 2007 @05:17PM (#17436392)
    Falcon looks to be the long-awaited open source storage engine that may become the primary choice for MySQL, and along the way offer some innovation and performance improvements over current alternatives.

    Come on. Give me PostgreSQL [postgresql.org] any day. After fiddling with MySQL at work for a few custom developments (both versions 4 and 5.0.2) i'm ready for anything else. Gave PSGSQL a shot at home and it runs very nice, with lots of advanced features. Anyone with real-world deploying feedback to share?
    • Re: (Score:3, Informative)

      by killjoe ( 766577 )
      Tried it. Found out there was no support for case insensitive collations and then dropped it. Every other database in the world supports case insensitive collations so now I am testing out firebird, mysql and db/2. All are OK, I probably don't need everything db/2 offers so I will most likely end up with one of the open sourced ones.
  • when i first read falcon storage engine, i initially thought of a storage software called falconstor [falconstor.com]. falconstor is more in the data backup/archive/recovery market, often used as a disk-to-disk-to-tape solution, or sometimes as a virtual tape device, but i imagine that often times a name that's close but different is almost always considered "too close" and forces a name change by somebody... let's see how this gets resolved. who knows, maybe the other falcon store [falconstudios.com](NSFW) might be the winner...
  • The summary does not mention that the alpha only supports x86 platforms... ie, on Linux, it may be compiled for 80486 and better 32 bit IA32 compatible microprocessors, and Intel EM32T compatible 64bit microprocessors such as the AMD's Athlon 64 family and of course the newer non-Itaniam 64bit Intel microprocessors.
    Support for other families of microprocessors, such as PowerPC and Sparc will come later.
  • by dybdahl ( 80720 ) <(kd.lhadbyd) (ta) (ofni)> on Wednesday January 03, 2007 @03:40AM (#17441488) Homepage Journal
    I've read through all comments with 2 or more in rating, and it seems that people really underestimate what Jim is doing here.

    We're talking in-memory MVCC here. This means you can add 1000 records, do a rollback, and the harddisk hasn't been accessed. Even if you commit, performance will eventually be magnificent compared with on-disk MVCC systems. You can run larger systems on one server with this, than you would be able to run on a cluster with other database systems.

    This system has been designed to provide very good performance improvements for those who do know how to create SQL statements, but probably even better performance improvements for those who don't. And we don't have a tradeoff between performance and transactions any more - transactions and better performance are both included.

    Also, please note that this technology will make MySQL a trustworthy data storage for many commercial applications out there, giving added value to their apps and their businesses. It will also enable small but very skilled development teams able to use MySQL as a trustworthy database for specialized applications - previously only Firebird and Postgresql were able to provide this for free, and even though Firebird has a very high deployment in USA's top 500 companies, postgresql seems to be very much *nix only in deployment statistics.

    I have been programming database applications for more than 20 years, and have been programming Oracle, MSSQL, MySQL, postgresql, Firebird, dBase, Paradox, Access and other databases. I see Jim's contributions to MySQL as extremely important for the database market. Instead of having "just" a transaction layer on top of a storage layer, MySQL now provides mechanisms that give this design an advantage over those database systems where the transactions are stored on disk (like Firebird, Postgresql).

    And - by the way - this has NOTHING to do with "optimizing for web applications". Web applications are just as diverse as GUI applications and other systems, and GUI applications will benefit from this as much as web applications.
    • A couple of questions you may be able to answer - if MVCC is being done in memory, and disk writes being delayed as long as possible, then I stand to lose an awful lot of data if my master DB server goes down dont't I? My slave will be well behind, and even if I can recover the master, then a lot of data wont have been saved to disk and the log checkpoints will be just as stale. Or is this some compromise, where a commit triggers a write to disk? If so, then I still don't get what's so great, as in all the

"The following is not for the weak of heart or Fundamentalists." -- Dave Barry

Working...