Garuda.Data - Apache Phoenix for .NET Developers

On June 20, 2016, Microsoft released a preview of its Microsoft.Phoenix.Client package on NuGet.org. This package provides a .NET Framework-compatible collection of classes for interfacing with the Apache Phoenix Query Server for Apache HBase. I had been evaluating Phoenix and HBase in the prior weeks, so the release of a .NET client library was very interesting to me. It was the only .NET-compatible client I was aware of, and I immediately began experimenting with it.

As I began learning how to use the Microsoft.Phoenix.Client, I noticed its use of Google Protocol Buffers and discovered that the underlying wire protocol also relied on Protocol Buffers. As it turns out, Apache Phoenix uses Apache Calcite, and specifically Avatica, at the network layer to facilitate the Java Database Connectivity (JDBC) interface.

The Microsoft.Phoenix.Client’s project site, hosted on GitHub, has some great examples of using the PhoenixClient. I also found duoxo’s tweet-sentiment-phoenix project to have some great examples. As I worked through my first use case, I missed the good old IDbConnection et al. paradigm implemented by so many relational database providers: SqlConnection, OracleConnection, and so on. Wouldn’t it be great if there were a PhoenixConnection?

With a resounding “yes!”, I started the PhoenixConnection class, which led to the PhoenixCommand and then PhoenixDataReader classes. Along with these classes came the familiar ConnectionString property on the PhoenixConnection class. After working through most of the interfaces, it became possible for me to open a connection to Apache Phoenix and execute queries from my code in a manner virtually every .NET developer is familiar with:

using (IDbConnection phConn = new PhoenixConnection())
{
    phConn.ConnectionString = cmdLine.ConnectionString;

    phConn.Open();

    using (IDbCommand cmd = phConn.CreateCommand())
    {
        cmd.CommandText = "SELECT * FROM GARUDATEST";
        using (IDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // Print every column of the current row as "name: value".
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    Console.WriteLine("{0}: {1}", reader.GetName(i), reader.GetValue(i));
                }
            }
        }
    }
}

It seemed apparent that these APIs would be useful to other .NET developers who need to interface with big data stored in Apache Phoenix/HBase. I decided to prepare a NuGet package, which I named Garuda.Data after the Garuda, the mythical bird of Hindu and Buddhist traditions, and uploaded an early “alpha” version to nuget.org in late July.

The Garuda.Data package has been updated several times since the original alpha release. As of this writing, it is in beta (v0.5.6067.42547) and includes:

  • PhoenixBulkCopy class, which takes advantage of the ExecuteBatchRequest mechanism for more efficient inserts/updates (UPSERTs in Phoenix SQL).
  • PhoenixTransaction class, which enables traditional transactional commit/rollback using the Phoenix Transactions functionality.
  • Compatibility with the .NET DataTable and DataGridView classes.
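
As a rough illustration of the last two items, here is a minimal sketch written only against the standard System.Data interfaces. The class name PhoenixSketch and both helper methods are hypothetical and are not part of Garuda.Data. The sketch assumes that the connection passed in (for example, a PhoenixConnection) follows the usual IDbConnection contract for BeginTransaction, and that transactions are enabled on the Phoenix side.

```csharp
using System;
using System.Data;

// Hypothetical helpers for illustration only; not part of Garuda.Data.
public static class PhoenixSketch
{
    // Runs a query and buffers the whole result set into a DataTable,
    // which can then be bound to a DataGridView or inspected in memory.
    public static DataTable LoadTable(IDbConnection conn, string sql)
    {
        using (IDbCommand cmd = conn.CreateCommand())
        {
            cmd.CommandText = sql;
            using (IDataReader reader = cmd.ExecuteReader())
            {
                var table = new DataTable();
                table.Load(reader); // standard DataTable.Load(IDataReader)
                return table;
            }
        }
    }

    // Executes the given statements (e.g. UPSERTs) as a single unit:
    // commit if all succeed, roll back if any of them throws.
    // Assumption: the provider honors IDbConnection.BeginTransaction.
    public static void ExecuteInTransaction(IDbConnection conn, params string[] statements)
    {
        using (IDbTransaction tx = conn.BeginTransaction())
        {
            try
            {
                foreach (string sql in statements)
                {
                    using (IDbCommand cmd = conn.CreateCommand())
                    {
                        cmd.Transaction = tx;
                        cmd.CommandText = sql;
                        cmd.ExecuteNonQuery();
                    }
                }
                tx.Commit();
            }
            catch
            {
                tx.Rollback();
                throw;
            }
        }
    }
}
```

With an open connection such as phConn from the earlier example, PhoenixSketch.LoadTable(phConn, "SELECT * FROM GARUDATEST") would return the rows as a DataTable. For large batches of UPSERTs, the PhoenixBulkCopy class mentioned above is likely the better fit than issuing statements one at a time.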

The Garuda.Data project is part of a solution, GarudaUtil, which includes a graphical user interface for connecting to and querying Apache Phoenix using the Garuda.Data library.

Garuda Query Screenshot


The GarudaUtil solution, including Garuda.Data and Garuda Query, is available on GitHub. I welcome feedback! Please use the GarudaUtil Issues section to report bugs or request enhancements.

Best,

Daniel (@dwdii)