Fujitsu and Japan's National Institute of Genetics are building what they expect will be the world's fastest database when it opens later this year.
A prototype of the system based on Fujitsu's Shunsaku XML database engine has already been completed and is currently undergoing in-house testing at the genetics institute, which is also known as Idenken in Japan.
Idenken's database is one of the world's three main genetics databases. It is a repository for data from all genome projects conducted by Japan's government, as well as all public-domain data from the Japan Patent Office. It currently holds 35 million records, containing DNA patterns totaling 39.8 billion bases, and its size is doubling every year.
More than 10,000 users consult the database each day, making speedy searches a top priority for Idenken. Its current system is based on a relational database and takes around 10 minutes to complete a two- or three-keyword search. The prototype system has already slashed the search time to around 5 seconds, said Osamu Akiba, director of Fujitsu's Triole Business Development Centre. He demonstrated the system at the Fujitsu Solution Forum event in Tokyo last week.
The secret to Shunsaku's speed is a search algorithm that does not require an index. Each search is done in real time, and new documents can begin appearing in search results as soon as they are added to the database, said Nick Hayashi, a spokesman for Fujitsu in Tokyo.
Because the Idenken database is constantly growing, the relational database's index always needs to be updated. If the index can't keep up with the speed at which new information is being added, the result is a much slower search, said Hayashi.
Because Shunsaku is always working on the database in real time, such problems do not affect it, he said.
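The trade-off Hayashi describes can be illustrated with a toy example. The sketch below is not Shunsaku's algorithm, which the article does not detail; it is a minimal Python illustration in which the class names `ScanStore` and `IndexedStore`, and the sample records, are hypothetical. One store scans every document on each query, so new records are searchable immediately. The other relies on an inverted index that must be refreshed before new records show up.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ScanStore:
    """Hypothetical index-free store: every search scans all documents."""

    docs: list[str] = field(default_factory=list)

    def add(self, doc: str) -> None:
        # Nothing else to maintain: the document is searchable right away.
        self.docs.append(doc)

    def search(self, *keywords: str) -> list[str]:
        # Keep documents whose words include every keyword.
        return [d for d in self.docs if set(keywords) <= set(d.split())]


@dataclass
class IndexedStore:
    """Hypothetical store with an inverted index refreshed in batches."""

    docs: list[str] = field(default_factory=list)
    index: dict[str, set[int]] = field(default_factory=dict)
    indexed_upto: int = 0  # documents at or past this position are not indexed yet

    def add(self, doc: str) -> None:
        # The document is stored, but invisible to search until the next refresh.
        self.docs.append(doc)

    def refresh_index(self) -> None:
        # Index only the documents added since the last refresh.
        for i in range(self.indexed_upto, len(self.docs)):
            for word in self.docs[i].split():
                self.index.setdefault(word, set()).add(i)
        self.indexed_upto = len(self.docs)

    def search(self, *keywords: str) -> list[str]:
        if not keywords:
            return []
        # Intersect the posting sets of all keywords.
        hits = set.intersection(*(self.index.get(k, set()) for k in keywords))
        return [self.docs[i] for i in sorted(hits)]


if __name__ == "__main__":
    record = "human chromosome 21 sequence"  # made-up sample record

    scan = ScanStore()
    scan.add(record)
    print(scan.search("human", "sequence"))  # found immediately

    indexed = IndexedStore()
    indexed.add(record)
    print(indexed.search("human", "sequence"))  # empty until the index is refreshed
    indexed.refresh_index()
    print(indexed.search("human", "sequence"))  # found after the refresh
```

A naive scan like this gets slower as the data grows, so the example shows only why an index-free design avoids stale results, not how Shunsaku achieves its speed.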
Part of the ongoing work between Fujitsu and Idenken will cover optimizing Shunsaku, which was originally designed for high-speed processing of text searches, to better handle complex data such as that found in the biotechnology field.
"We created the prototype to copy the functions of the existing database and are adding functions to it," said Hayashi. "We are going to enhance it further and it may become faster, maybe 200 times faster (than the current relational database)."
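The figures quoted are approximate, but taken together they give a rough sense of scale. The short Python calculation below assumes the current search takes exactly 10 minutes and the prototype exactly 5 seconds, and reads "200 times faster" as a search time of one two-hundredth of the current one; the variable names are illustrative only.

```python
# Figures from the article, treated as exact for the sake of the arithmetic.
current_seconds = 10 * 60   # current relational system: about 10 minutes
prototype_seconds = 5       # prototype: about 5 seconds

# Speedup the prototype already delivers over the current system.
prototype_speedup = current_seconds / prototype_seconds

# Search time implied by the "200 times faster" goal (assumed interpretation).
target_speedup = 200
target_seconds = current_seconds / target_speedup

print(f"Prototype speedup: {prototype_speedup:.0f}x")
print(f"Search time at {target_speedup}x: {target_seconds:.0f} seconds")
```

On those assumptions the prototype is already about 120 times faster, and the 200-fold figure would bring a search down to roughly 3 seconds.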
Shunsaku is already available in Japan under the name "Interstage Shunsaku Data Manager Enterprise Edition," and Fujitsu plans to put it on sale elsewhere in the world later this year, said Hayashi.