Aster’s Big Data Architecture

September 3, 2010

As I mentioned in my last entry, the week before last I headed out to the TDWI World Conference in San Diego.  Besides talking about Dell’s new BI practice, I was there to represent our data analytics partners, Aster Data and Greenplum.  Both vendors also had booths of their own and I was able to grab some time with Jeff Zeisler, director of pre-sales engineers at Aster Data, to get an overview of their architecture.  Here’s what Jeff had to say:

Some of the ground Jeff covers:

  • Aster is an MPP (massively parallel processing) data warehouse solution.  It runs on a cluster of commodity hardware that executes SQL queries in parallel.
  • The 3 layers to the architecture:
    • Queen tier – central location users use to submit queries. It figures out how to split up the query and send it to the next tier.
    • Worker tier – where most of the servers are located, where data is stored (locally on the servers) and where all the heavy lifting for processing occurs.  The MapReduce framework is built into this tier and sits right next to the SQL execution engine.
    • Loader and exporter tier – a separate tier of machines that can be used to bulk load new data into the system.
  • How it works: A query gets broken up across all the machines, each machine executes some portion of the query, and the results are brought back together at the Queen and returned to the user.
  • New cool things coming up in the next 6 months.
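
To make the Queen/worker flow a bit more concrete, here is a minimal Python sketch of the scatter-and-gather idea Jeff describes.  It is an illustration only, not Aster's code: the sample data and the names `WORKER_SLICES`, `worker_partial_sum` and `queen_run_query` are all made up, and threads stand in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list is the slice of a "sales" table
# stored locally on one worker (region, amount).
WORKER_SLICES = [
    [("west", 10), ("east", 5)],
    [("west", 7), ("north", 3)],
    [("east", 8), ("north", 2)],
]


def worker_partial_sum(rows):
    """Worker tier: aggregate only the rows stored on this worker."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals


def queen_run_query(slices):
    """Queen tier: send the work to every worker, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(worker_partial_sum, slices))

    merged = {}
    for partial in partials:
        for region, subtotal in partial.items():
            merged[region] = merged.get(region, 0) + subtotal
    return merged


# Roughly "SELECT region, SUM(amount) FROM sales GROUP BY region"
result = queen_run_query(WORKER_SLICES)
print(result)
```

Each worker only ever touches its own slice, and the Queen only sees the small per-worker summaries, which is the basic reason this kind of layout scales out on commodity boxes.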

Extra:

Pau for now…


Dell has a BI practice?!

August 31, 2010

The week before last I headed out to The Data Warehouse Institute’s (TDWI) World Conference in San Diego.  I went out to help support our BI team, who were using the event as the forum to unveil Dell’s new Business Intelligence practice.

We got a bunch of puzzled looks as people approached the Dell booth and didn’t see any hardware.  However, once they learned what we were there to announce and why, they seemed to buy it (or maybe they just said they got it because they didn’t want to lose out on a chance to win the Dell Mini we were giving away 🙂 ).

BI veteran Mike Lampa, who has been driving the go-to-market effort behind the practice, acted as our chief spokesperson.   Here’s the message we were delivering, straight from Mike:

Some of the ground Mike covers:

  • Internally, Dell has one of the top 5 data warehouse implementations in the world and we use most of the mainstream ETL, BI and database tools that are out there in the market.
  • The Perot acquisition has given us access to a global services delivery engine and we are marrying this channel with the BI expertise we’ve developed internally.
  • We’ll provide consulting services through our verticals and deliver end-to-end solutions targeted at vertical markets like Education, Health Care and Financial Services.
  • Our goal is to do in services what we did in hardware: be a disruptive force and bring in higher levels of innovation.

Extra Credit Reading

Pau for now…


Big Data in the Windy City

May 20, 2010

The Aqua building, catty corner from my hotel

Last Tuesday and Wednesday, I attended the TDWI (The Data Warehouse Institute) world conference in Chicago.  The show was a mix of courses and exhibit space.

I went to learn about the BI/Data warehousing segment and scout in preparation for the next conference in August.

Why BI?

My interest in the space comes from the fact that two of the first three partners in our Cloud Partner program are in the Data Warehousing and analytics space: Aster Data and Greenplum.  Both of these partners are leveraging highly scaled-out architectures to crunch data.

While there, besides checking out the 24 companies on the exhibit floor, I attended three half-day classes: “Developing your BI tool strategy”; “Cool BI, the latest innovations”; and “Extending BI to support online marketing and Web 2.0.”

For other newbies like myself, here are some notes from the first course.

My Notes: The layers of the BI Lifecycle stack

BI Suites:

  • What they do: Query, report, analyze, visualize, alert (front end to the chain)
  • The Big 4:  IBM (Cognos), SAP (Business Objects), Oracle (Hyperion), Microsoft
    • They all bought small players who excelled in the space
    • Usually offer the suites as part of a complete BI lifecycle stack
    • Two of the remaining independents are MicroStrategy and SAS

Data Management

  • Data warehouse/mart databases and storage
  • Usually in an RDBMS but also in a dedicated OLAP database
  • Examples: Aster Data, Greenplum, Netezza, Teradata

Data Integration (aka ETL)

  • They extract, transform and load info from the layer below into the layer above.
  • Examples: Informatica

Operational Apps/Systems

  • Planning, ERP, CRM etc
  • Orders, Invoices, Shipping, Web clicks
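
For fellow newbies, here is a rough Python sketch of what the Data Integration (ETL) layer does: pull raw records out of an operational system, clean them up, and load them into a warehouse table that the BI tools can query.  Everything in it is an assumption made for illustration (the sample orders, the `fact_orders` table name, and the use of an in-memory SQLite database as a stand-in warehouse); real ETL tools such as Informatica do far more than this.

```python
import sqlite3

# Hypothetical raw records, as they might come out of an order system.
raw_orders = [
    {"order_id": "1001", "region": " West ", "amount": "19.99"},
    {"order_id": "1002", "region": "east", "amount": "5.00"},
    {"order_id": "1003", "region": "WEST", "amount": "10.01"},
]


def extract():
    """Extract: read records from the operational layer."""
    return list(raw_orders)


def transform(records):
    """Transform: fix types and normalize the region names."""
    return [
        (int(r["order_id"]), r["region"].strip().lower(), float(r["amount"]))
        for r in records
    ]


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(conn, transform(extract()))

# The kind of question the BI suite at the top of the stack would ask.
report = conn.execute(
    "SELECT region, COUNT(*) FROM fact_orders GROUP BY region ORDER BY region"
).fetchall()
print(report)
```

Note how " West " and "WEST" end up in the same bucket after the transform step; that cleanup is a big part of why the integration layer exists.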

Extra-credit reading

Pau for now…


Datawarehouser Greenplum — Talking to President and Founder, Scott Yara

May 7, 2010

When I was out in the Bay Area for our launch I stopped by data warehouse and analytics player Greenplum.  Greenplum is one of the first three members in our Cloud Partner program (the other two are Canonical and Aster Data).  I sat down with Greenplum’s President and founder Scott Yara to talk about the company and where they’re going:

Some of the topics Scott tackles:

  • What’s happening in the world of data.
  • How Greenplum began with the open source PostgreSQL database platform and over the last 7-8 years has refactored it and built a massively parallel database kernel engine.
  • How it works:  Greenplum takes the data and physically distributes it across all the Database segments and operates on the data in parallel.  This parallel approach allows Greenplum to process data 10-100x faster than conventional databases.
  • Who is using it: Skype, Fox Interactive, NTT docomo, Deutsche Bank, retailers, large healthcare companies.
  • The enterprise data cloud initiative – Setting a new type of analytics infrastructure that takes advantage of virtualization and the latest in general purpose and multi-core systems and is centered around self-service principles.
  • While a lot of folks are excited about writing apps for the iPhone, the platform that Scott and crew get really excited about writing to is the 2-socket Nehalem server with a bunch of disk drives behind it.
  • How someone would go about getting started with Greenplum.
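
As a rough illustration of the "physically distributes it across all the database segments" point, here is a small Python sketch.  It is not Greenplum's actual distribution logic: the segment count, the CRC32-based `segment_for` function, and the sample call records are all assumptions chosen to keep the example short.

```python
import zlib

NUM_SEGMENTS = 4  # hypothetical number of database segments


def segment_for(key):
    """Pick a segment from a hash of the distribution key (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SEGMENTS


def distribute(rows):
    """Place each row on exactly one segment, based on its user_id."""
    segments = [[] for _ in range(NUM_SEGMENTS)]
    for row in rows:
        segments[segment_for(row["user_id"])].append(row)
    return segments


# Hypothetical call records: user0 talked 0 minutes, user1 talked 1, and so on.
rows = [{"user_id": f"user{i}", "minutes": i} for i in range(100)]
segments = distribute(rows)

# Each segment scans only its own rows; in a real cluster these scans
# would run at the same time on different machines.
partial_totals = [sum(r["minutes"] for r in seg) for seg in segments]
total = sum(partial_totals)

print([len(seg) for seg in segments], total)
```

Because the same key always hashes to the same segment, each segment can work through its share of the data without talking to the others, and only the small partial totals need to be combined at the end.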

Extra Credit reading:

Pau for now…