Thursday, January 23, 2014

C# Driver for Apache Cassandra Remote Authentication


Apache Cassandra is an open source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.

Many APIs and drivers are available for it at the following URL: http://www.datastax.com/download/clientdrivers

One of those drivers is cassandra-sharp. Its philosophy is to be really simple and fast: no LINQ provider, no complex API. Just CQL, simple object mapping and great performance.

One of the requirements I had for our cloud environment was being able to create user accounts and assign privileges on KEYSPACES.

I came up with a simple class and everything worked fine on my local machine. However, whenever we attempted remote authentication, the following call failed with this error message:


_cluster = Cluster.Builder().WithCredentials("user", "pass").WithPort(port).AddContactPoint(node).Build();

Cassandra.AuthenticationException: Unsupported Authenticator org.apache.cassandra.auth.PasswordAuthenticator

We opened a case with DataStax, and they told us that the modified C# driver for DSE 3.1/3.2 authentication wasn't available yet but might be ready by the end of January 2014.

Once the updated driver is provided, I'll report whether it solves the remote authentication issue.

UPDATE: This has been fixed in the latest version of the Cassandra C# driver.



Tuesday, January 21, 2014

Statistics IO Parser

One of the best ways to start tuning your queries is by looking at the amount of disk activity generated by your Transact-SQL statements.

You do this by setting STATISTICS IO to ON at run time, like this:

SET STATISTICS IO ON

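For each table the query touches, the output reports the scan count, logical reads, physical reads, read-ahead reads, and the matching LOB read counters.

For the query shown below, the output looks something like this: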

Table 'PurchaseOrderDetail'. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 64, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'PurchaseOrderHeader'. Scan count 1, logical reads 44, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


SET STATISTICS IO ON
SELECT * FROM [Purchasing].[PurchaseOrderHeader] a
JOIN [Purchasing].[PurchaseOrderDetail] b
ON a.PurchaseOrderID = b.PurchaseOrderID
WHERE OrderDate >= '2004-05-17 00:00:00.000'

This is useful, but what if you want to see the total I/O generated by the query, with better-formatted output? Today I found a website built by Richie Rump, http://statisticsioparser.com/, which formats the STATISTICS IO output.

Just paste the output you got with STATISTICS IO turned ON and click the Parse button.



You will get nicely formatted output, including the I/O totals.
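If you are curious how such totals can be computed, here is a minimal Python sketch of the idea. It is not the code behind statisticsioparser.com; the function names parse_statistics_io and total_io are made up for this illustration, and it assumes every table line follows the "Table 'name'. counter value, counter value, ..." layout shown above. Other lines in the Messages output, such as row counts, are simply skipped.

```python
import re
from collections import defaultdict

# Matches "Table 'Name'. <comma-separated counters>" (layout assumed from the sample output).
TABLE_LINE = re.compile(r"^Table '([^']+)'\.\s*(.*)$")

# The two output lines quoted earlier in this post.
SAMPLE = """\
Table 'PurchaseOrderDetail'. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 64, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'PurchaseOrderHeader'. Scan count 1, logical reads 44, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
"""


def parse_statistics_io(text):
    """Return a list of (table_name, {counter_name: value}) pairs."""
    rows = []
    for line in text.splitlines():
        match = TABLE_LINE.match(line.strip())
        if not match:
            continue  # not a table line (blank line, row count, etc.)
        table, counters_text = match.group(1), match.group(2).rstrip(". ")
        counters = {}
        for part in counters_text.split(","):
            # "logical reads 66" -> name "logical reads", value "66"
            name, _, value = part.strip().rpartition(" ")
            if name and value.isdigit():
                counters[name.lower()] = int(value)
        rows.append((table, counters))
    return rows


def total_io(rows):
    """Sum every counter across all parsed tables."""
    totals = defaultdict(int)
    for _table, counters in rows:
        for name, value in counters.items():
            totals[name] += value
    return dict(totals)


if __name__ == "__main__":
    parsed = parse_statistics_io(SAMPLE)
    for table, counters in parsed:
        print(table, counters)
    print("TOTAL", total_io(parsed))
```

Counter names are lowercased so that "Scan count" and "logical reads" end up in one consistent set of keys. Any counter the sketch does not recognize as "name number" is ignored rather than guessed at.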