IBM rig doesn’t look like much, scans 10 billion files in 43 minutes

Someone ought to gift these IBM researchers a better camera, because their latest General Parallel File System is a back-slapping 37 times faster than their last effort back in 2007. The rig combines ten IBM System xSeries servers with Violin Memory SSDs that hold 6.5 terabytes of metadata relating to 10 billion separate files. Every one of those files can be analyzed and managed using policy-guided rules in under three quarters of an hour. That sort of performance might sound like overkill, but it is only just barely in step with what IBM’s Doug Balog describes as a “rapidly growing, multi-zettabyte world.” No prizes for guessing who their top customer is likely to be. Full details in the PR after the break.

Made in IBM Labs: Researchers Demonstrate Breakthrough Storage Performance for Big Data Applications

IBM GPFS Storage Technology Scans 10 Billion Files in 43 Minutes

SAN JOSE, CA, July 22, 2011: Researchers from IBM (NYSE: IBM) today demonstrated the future of large-scale storage systems by successfully scanning 10 billion files on a single system in just 43 minutes, shattering the previous record (http://www-03.ibm.com/press/us/en/pressrelease/22405.wss) of 1 billion files in three hours by a factor of 37.
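For a sense of scale, the headline figures can be turned into scan rates. The short Python sketch below uses only the rounded numbers quoted in the release (10 billion files in 43 minutes, 1 billion files in three hours); the variable and function names are illustrative, not anything from IBM.

```python
# Back-of-the-envelope scan rates from the rounded figures in the release.
FILES_NEW = 10_000_000_000   # files scanned in the 2011 demonstration
MINUTES_NEW = 43             # quoted duration of the 2011 scan
FILES_OLD = 1_000_000_000    # files scanned in the 2007 record
HOURS_OLD = 3                # quoted duration of the 2007 scan


def files_per_second(files: int, seconds: int) -> float:
    """Average scan rate over the whole run."""
    return files / seconds


new_rate = files_per_second(FILES_NEW, MINUTES_NEW * 60)
old_rate = files_per_second(FILES_OLD, HOURS_OLD * 3600)
speedup = new_rate / old_rate

print(f"2011 rate: {new_rate:,.0f} files/s")
print(f"2007 rate: {old_rate:,.0f} files/s")
print(f"ratio of rates: {speedup:.1f}x")
```

The 2011 run averages roughly 3.9 million files per second. With these rounded inputs the ratio of rates comes out closer to 42 than 37; the release does not give the exact timings behind its figure, so the gap is presumably down to rounding in the quoted durations.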

As data grows at unprecedented scale, this advance unifies data environments on a single platform, instead of distributing them across several systems that must be managed separately. It also dramatically reduces and simplifies data management tasks, allowing more information to be stored in the same technology, rather than continuing to buy more and more storage.

In 1998, IBM researchers unveiled a highly scalable, clustered parallel file system called General Parallel File System (GPFS, http://www-03.ibm.com/systems/software/gpfs/), which was further tuned to make this breakthrough possible. GPFS represents a significant advance in scaling storage performance and capacity while keeping management costs flat. This innovation could help organizations cope with the exploding growth of data, transactions and digitally-aware sensors and other devices that comprise Smarter Planet systems. It is ideal for applications requiring high-speed access to large volumes of data, such as data mining to determine customer buying behaviors across massive data sets, seismic data processing, risk management and financial analysis, weather modeling and scientific research.

Driving New Levels of Storage Performance

Today’s breakthrough was achieved using GPFS running on a cluster of 10 eight-core systems and solid state storage, taking 43 minutes to perform this selection. The GPFS management rules engine provides comprehensive capabilities to service any data management task.

GPFS’s advanced algorithm makes possible the full use of all processor cores on all of these machines in all phases of the task (data read, sorting and rules evaluation). GPFS exploits the solid state storage appliances, with only 6.8 terabytes of capacity, for excellent random performance and high data transfer rates when storing the metadata. The appliances sustainably perform hundreds of millions of data input-output operations, while GPFS continuously identifies, selects and sorts the right set of files among the 10 billion on the system.
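The three phases named above (metadata read, rules evaluation and sorting, spread across every available worker) can be illustrated with a toy Python sketch. This is not GPFS code or its policy language: `FileMeta`, `migrate_rule` and `policy_scan` are hypothetical names, the rule itself is invented for illustration, and a thread pool stands in for the cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from heapq import merge


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical per-file metadata record."""
    path: str
    size_bytes: int
    days_since_access: int


def migrate_rule(meta: FileMeta) -> bool:
    # Invented policy: files over 1 MiB that have not been read for 90+ days.
    return meta.size_bytes > 1_048_576 and meta.days_since_access >= 90


def scan_partition(records, rule):
    """Evaluate the rule on one partition and sort its matches, stalest first."""
    matches = [m for m in records if rule(m)]
    matches.sort(key=lambda m: m.days_since_access, reverse=True)
    return matches


def policy_scan(records, rule, workers=4):
    """Split the metadata across workers, scan each part, merge sorted results."""
    partitions = [records[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_parts = list(pool.map(lambda p: scan_partition(p, rule), partitions))
    # Each partition is already sorted, so a k-way merge yields the global order.
    return list(merge(*sorted_parts,
                      key=lambda m: m.days_since_access, reverse=True))


records = [
    FileMeta("/a", 2_000_000, 100),
    FileMeta("/b", 500, 400),
    FileMeta("/c", 5_000_000, 10),
    FileMeta("/d", 3_000_000, 365),
    FileMeta("/e", 2_000_000, 90),
]
selected = policy_scan(records, migrate_rule, workers=2)
print([m.path for m in selected])
```

Sorting within each partition and merging afterwards keeps every worker busy in every phase, which is the property the release attributes to GPFS; how GPFS actually partitions and sorts its metadata is not described in the source.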

“Today’s demonstration of GPFS scalability will pave the way for new products that address the challenges of a rapidly growing, multi-zettabyte world,” said Doug Balog, vice president, storage platforms, IBM. “This has the potential to enable much larger data environments to be unified on a single platform and dramatically reduce and simplify data management tasks such as data placement, aging, backup and migration of individual files.”

The previous record was also set by IBM researchers, at the Supercomputing 2007 conference in Reno, NV, where they demonstrated the ability to scan one billion files in three hours.

“Businesses in every industry are looking to the future of data storage and management as we face a problem springing from the very core of our success: managing the massive amounts of data we create every day,” said Bruce Hillsberg, director of storage systems, IBM Research – Almaden. “From banking systems to MRIs and traffic sensors, our day-to-day lives are engulfed in data. But it can only be useful if it is effectively stored, analyzed and applied, and businesses and governments have relied on smarter technology systems as the means to manage and leverage the constant influx of data and turn it into valuable insights.”

IBM Research continues to develop innovative storage technologies to help clients not just manage data proliferation, but harness data to create new services. Last year alone, IBM storage products included more than five significant storage innovations invented by IBM Research, including IBM Easy Tier, Storwize V7000, Scale-out Network Attached Storage (SONAS), IBM Information Archive and IBM Long Term File System (LTFS).

As the amount of digital data grew 47 percent over the past year, businesses are under tremendous pressure to quickly turn data into actionable insights, but grapple with how to manage and store it all. As new applications emerge in industries from financial services to healthcare, traditional data management systems will be unable to perform common but critical storage management tasks, leaving organizations exposed to critical data loss.

Anticipating these storage challenges decades ago, researchers from IBM Research – Almaden created GPFS to help businesses manage the exploding growth of data, transactions and digitally-aware devices on a single system. Already deployed to perform tasks like backup, information lifecycle management, disaster recovery and content distribution, this technology’s approach overcomes the challenge of managing unprecedentedly large file systems by combining multi-system parallelization with fast access to file system metadata stored on a solid state storage appliance.

Additional details about the breakthrough are available here.
