The Hypertable Blog
New Cluster Administration Tool - ht_cluster
As of the 0.9.8.4 release we’ve introduced a new cluster administration tool called ht_cluster. The tool provides a simple way to perform common administrative tasks such as starting and stopping Hypertable and upgrading to a new version. This tool replaces Capistrano, which was the recommended cluster administration tool that served us well for many years. ht_cluster is based on the same role and task concepts as Capistrano, where roles are named sets of hosts (e.g. master, slave) and tasks are shell scripts associated with roles that run in parallel on all hosts in the associated roles. While conceptually very similar, ht_cluster improves upon Capistrano in a number of ways, most notably usability. The following list describes the ways in which ht_cluster improves upon Capistrano.
- No Dependencies – No dependency on Ruby or any Ruby gems. All of the dependencies are included in the Hypertable package, so ht_cluster works “right out of the box”.
- Single Language – The cluster configuration file is written entirely in Bash, whereas with Capistrano the language is a mixture of Ruby and Bash, making it difficult to mentally jump back-and-forth between languages.
- Maximum Performance – ht_cluster is implemented with libssh 0.6.3 and connection establishment and command execution happens asynchronously and in parallel, which provides for maximum parallelism and performance.
- Host Specification Pattern – Hosts can be described using a concise and flexible host specification pattern. Several hundred hosts can be specified with a pattern such as test[000-299]. Hostnames can also be subtracted from a pattern which allows dead hosts to be easily removed from the configuration, for example: test[000-299] - test137. Capistrano, on the other hand, requires each host name to be listed individually.
- Stable Output – ht_cluster buffers command output and writes the complete output for each host in order of host name. If the command output is stable, then the output of ht_cluster is stable, making it useful as a testing tool. Capistrano, on the other hand, interleaves the output of the commands as then run in parallel on each host, making the output unstable.
- Upgradability – The deployment-specific cluster configuration (roles) and the common cluster configuration (tasks) are stored in separate files. This allows the tasks that are common to all deployments to be upgraded independently and automatically. With Capistrano, upgrading common task definitions requires manual cutting and pasting of the old configuration file, along with the task definitions from the new one, in order to create an updated configuration file.
- Random Start Delay – ht_cluster provides a random-start-delay option to allow tasks to be started on each host after a random delay. This feature can be used to avoid thundering herd problems that can occur when a service is started on a large number of hosts simultaneously.
- Command History – ht_cluster uses the history feature of libedit, the command-line editing library that it uses to read interactive commands. Through the use of this feature, a historical record is kept of recent commands that can be navigated with the arrow keys (or Emacs-style navigation keys). The command history is stored in the file ~/.cluster_history, making the command history available across invocations of the tool.
- Task Arguments – Task definitions in ht_cluster can accept arguments. Since tasks compile down into Bash functions, it’s easy to write tasks that accept and manipulate any number of arguments.
Run Task on Any Host – By default, a task will run on the hosts defined by the roles with which it is associated. However, ht_cluster also allows tasks to be run on an arbitrary set of hosts by providing an on <hostspec> argument to the task. For example, this allows new machines to be added to a cluster and bootstrapped with a single command:
ht cluster start_slaves on test37,test41
Posted By: Doug Judd
New and Improved Secondary Indexes
As part of our ongoing effort to make Hypertable useful for a wide range of applications, we've greatly improved our support for secondary indexing. These improvements, introduced in the 0.9.8.0 release, enable much more powerful queries that access rows by means other than the primary key. As part of this effort, we've changed the query language to allow for more expressive queries and have brought it more in-line with SQL.
Like the original implementation, thew new secondary indexes are fully consistent and massively scalable. They can be used to index not only column values, but also column qualifiers. Secondary indexes can be defined in the CREATE TABLE statement with the INDEX and QUALIFIER INDEX clauses, for example:
CREATE TABLE customers (name, INDEX name, tag, QUALIFIER INDEX tag, info, INDEX info, QUALIFIER INDEX info );
The ALTER TABLE command has been enhanced to allow for creating and dropping indexes of preexisting tables. For example:ALTER TABLE customers ADD ( QUALIFIER INDEX name ) DROP INDEX ( info );
To populate indexes added with ALTER TABLE or to convert indexes created with older versions of Hypertable into the new format, the indexes need to be rebuilt. This can be accomplished with the new REBUILD INDICES command. For example:REBUILD INDICES customers;
The examples which follow in this post assume that the following table has been created and populated with products.tsv.CREATE TABLE products ( title, section, info, category, INDEX section, INDEX info, QUALIFIER INDEX info, QUALIFIER INDEX category );
A set of scripts and data files for running all of the examples in this post can be found in the archive secondary-indexes.tgz.Queries against the value index
One important difference between secondary index queries...
Testing of Hypertable RangeServer Failover
As we mentioned in our previous post announcing the newly arrived RangeServer failover feature, robustness and application transparency were #1 priorities. To achieve these objectives, we placed enormous emphasis on testing. While this testing effort was painstaking and led to a very long development cycle, it has paid enormous dividends in quality and robustness. The following table lists the regression tests that were added to Hypertable to guarantee correctness and ensure that the expected behavior holds true for future releases.# Test Description 1 RangeServer-failover-basic-1 Start Hypertable with two RangeServers (rs1,rs2) and then 1) Load table with data, 2) Kill rs1 and wait for recovery to complete, 3) Stop Hypertable and then restart with just rs2. Dump keys after each step and verify that the table contains the exact set of keys that were loaded. Verify that no ranges are assigned to rs1. 2 RangeServer-failover-basic-2 Start Hypertable with three RangeServers (rs1,rs2,rs3) and then 1) Load table with data, 2) Kill rs1 and wait for recovery to complete, 3) Stop Hypertable and then restart with just rs2 and rs3. Dump keys after each of these three steps and verify that the table contains the exact set of keys that were loaded. 3 RangeServer-failover-basic-3 Start Hypertable with five RangeServers (rs1,rs2,rs3,rs4,rs5) and then 1) Load table with data, 2) Kill rs1 and rs2 and wait for recovery to complete for both servers, 3) Stop Hypertable and then restart with just rs3, rs4, and rs5. Dump keys after each of the three steps and verify that the table contains the exact set of keys that were loaded. 4 RangeServer-failover-basic-4 Start Hypertable with two RangeServers (rs1,rs2) and then load table with data. Kill rs1 and wait for recovery to...
Hypertable has Reached a Major Milestone!
With the release of Hypertable version 0.9.7.0 comes support for automatic RangeServer failover. Hypertable will now detect when a RangeServer has failed, logically remove it from the system, and automatically re-assign the ranges that it was managing to other RangeServers. This represents a major milestone for Hypertable and alows for very large scale deployments. We have been activly working on this feature, full-time, for 1 1/2 years. To give you an idea of the magnitude of the change, here are the commit statistics:
- 441 changed files
- 17,522 line additions
- 6,384 line deletions
The reason that this feature has been a long time in the making is because we placed a very high standard of quality for this feature so that under no circumstance, a RangeServer failure would lead to consistency problems or data loss. We're confident that we've achieved 100% correctness under every conceivable circumstance. The two primary goals for the feature, robustness and applicaiton transparancy, are described below.Robustness
We designed the RangeServer failover feature to be extremely robust. RangeServers can fail in any state (mid-split, transferring, etc.) and will be recovered properly. The system can also withstand the loss of any RangeServer, even the ones holding the ROOT or other METADATA ranges. To achieve this level of robustness, we added 63 regression tests that verify the correct handling of RangeServer failures in every conceivable failure scenario. We will follow up later with a blog post describing these tests.Application Transparency
Another important aspect of our RangeServer failover implementation is application transparency. Aside from a transient delay in database access, RangeServer failures are...
Roadmap to Hypertable 1.0
With the release of Hypertable version 0.9.6.0 I thought I would take some time to describe where we are in terms of the Hypertable 1.0 release and what work is remaining. We had intended to make the next Hypertable release our beta release. However, it’s been four months since the release of 0.9.5.6 and since the beta release is not quite ready to go, we decided to do one last alpha release and call it 0.9.6.0. In this release we’ve put in a considerable effort to fix a number of stability issues that have affected prior releases.0.9.6.0 Stability Improvements for HDFS deployments
The biggest source of instability for Hypertable deployments running on top of HDFS has do with the unclean shutdown of either the Master or RangeServer. Upon restart after this situation has ocurred, the RangeServer (or Master) can fail to come up with an error message similar to the following in its log file:1342810317 ERROR Hypertable.RangeServer : verify_backup (/root/src/hypertable/src/cc/Hypertable/Lib/MetaLogReader.cc:131): MetaLog file '/hypertable/servers/rs12/log/rsml/0' has length 0 < backup file '/opt/hypertable/0.9.5.6 /run/log_backup/rsml/rs12/0' length 11376
This problem was due to a misunderstanding on our part of the HDFS API semantics. Whenever the Master or RangeServer writes data to any of its log files, it makes a call to FSDataOutputStream.sync() to ensure that the data makes it in to the filesystem and is persistent. However, after making this call, a call to the FileStatus.getLen() does not return the correct value. FileStatus.getLen() only returns the correct file length if the file was properly closed. HDFS provides an alternate API, DFSClient.DFSDataInputStream.getVisibleLength(), that returns the actual length of the file regardless...
Secondary Indices Have Arrived!
Until now, SELECT queries in Hypertable had to include a row key, row prefix or row interval specification in order to be fast. Searching for rows by specifying a cell value or a column qualifier involved a full table scan which resulted in poor performance and scaled badly because queries took longer as the dataset grew. With 0.9.5.6, we’ve implemented secondary indices that will make such SELECT queries lightning fast!
Hypertable supports two kinds of indices: a cell value index and a column qualifier index. This blog post explains what they are, how they work and how to use them.The cell value index
Let’s look at an example of how to create those two indices. A big telco asks us to design a table for its customer data. Every user profile has a customer ID as the row key. But our system also wants to provide fast queries by phone number, since customers can dial in and our automated answering system can then immediately figure out who’s calling by checking the caller ID. We therefore decide to create a secondary index on the phone number. The following statement might be used to create this table and along with a phone number index:CREATE TABLE customers ( name, address, phone_no, INDEX phone_no );
Internally, Hypertable will now create a table customers and an index table ^customers. Every cell that is now inserted into the phone_no column family will be transformed and inserted into the index table as well. If you’re curious, you can insert some phone numbers and run, SELECT * FROM “^customers”; to see how the index was updated.
Not every query makes use of the...
Sehrch.com: A Structured Search Engine Powered By Hypertable
Sehrch.com is a structured search engine. It provides powerful querying capabilities that enable users to quickly complete complex information retrieval tasks. It gathers conceptual awareness from the Linked Open Data cloud, and can be used as (1) a regular search engine or (2) as a structured search engine. In both cases conceptual awareness is used to build entity centric result sets. Try this simple query: Pop singers less than 20 years old.
Sehrch.com gathers data from the Semantic Web in the form of RDF, crawling the Linked Open Data cloud and making requests with headers accepting RDF NTriples. Data dumps are also obtained from various sources. In order to store this data, we required a data store capable of storing tens of billions of triples using the least hardware while still delivering high performance. So we conducted our own study to find the most appropriate store for this type and quantity of data.
As Semantic Web people, our initial choice would have been to use native RDF data stores, better known as triplestores. But from our initial usage we quickly concluded that SPARQL compliant triplestores and large quantities of data do not mix well. As a challenge, we attempted to load 1.3 billion triples (the entire DBpedia and Freebase datasets) into a dual core machine with only 3GB memory. The furthest any of the open source triplestores (4store, TDB, Virtuoso) progressed to load the datasets upon the given hardware was around 80 million triples. We were told that the only solution was more hardware. We weren't the only ones facing significant hardware requirements when attempting to the load this volume of data. For example, in the following post a machine with 8 cores and 32GB...
Welcome to the new Hypertable website. This new website is easy to navigate and has all of the tools you'll need to learn about Hypertable and easily deploy it for your big data aplications.
We have put a tremendous amount of effort into the Documentation section of the website. There you'll find an Architectural Overview, Installation Instructions, Administrator Guides, Reference Guides and much more. Be sure check out the Code Examples section for working code examples written Java, C++, PHP, Python, Perl and Ruby.
We're also very excited to announce new products and services available today:
- UpTime Support Subscription – for round-the-clock, 7 days a week, 365 days a year "uptime" assurance for your Hypertable deployments -- with support stafflocated in the United States and Europe
- Training and Certification – taught by big data experts, Hypertable Training and Certification classes are held in Silicon Valley, USA and Munich, Germany
- Commercial License – for organizations such as OEMs, ISVs, and VARs who distribute Hypertable with their closed source products, Hypertable Inc. offers the software under a flexible OEM commercial license
Also, please check out the Hypertable vs. HBase Performance Evaluation II. In this in-house performance test, Hypertable demonstrates its power by easily loading 167 billion records into a 16-node test cluster, a test in which HBase failed, struggling with memory management.
If you're looking to take commercial advantage of the power of big data, then the Hypertable database platform is the right choice for you. Our new product and service offerings will ensure that you get the most out of your Hypertable deployment, allowing you to take full advantage of Hypertable's unprecedented performance and cost-efficiency...