Offshore Outsourcing: Informatica ETL programming, data warehousing, and data mart development in San Francisco

Informatica ETL programming

The market space for data extract, transform, and load (ETL) is paradoxical these days, marked simultaneously by crowding and consolidation. Two factors account for this condition: an increasing number of players entering the lucrative ETL market and a consolidation of existing players in overall data management. Database vendors such as Oracle and Microsoft are adding and augmenting ETL capabilities in the database, vendors such as Journee and Siperian are providing ETL-like capabilities under the guise of enterprise information integration, and companies such as SAS and SyncSort are bringing their data management expertise to bear in ETL solutions. At the same time, vendor ranks have thinned due to recent mergers, such as Sagent with Group 1 Software and Data Junction with Pervasive Software.

PowerCenter 7.0, the latest version of the flagship ETL solution from Informatica, is the company's attempt to stay competitive in this mixed market and maintain a frontline position in mind share and market share.

Three Levels of Labor

PowerCenter 7.0 (I'll call it PC7) is an ETL tool in the classic mold: data extract, transform, and load logic is constructed in a (mostly) sequential arrangement of graphical objects that flow from source to target. The objective is conceptually simple: Read data from source, transform it as needed, and write it to target. Reality is a little more complex, of course, and the construction of logic happens at three levels.

At the lowest level, individual graphical objects can be sources, targets, or transformations (sources and targets can themselves be considered special types of transformations). A source transformation reads from a data source and supplies that data, row by row, for subsequent processing. At the other end of the logic stream, the target transformation receives data (again, in row-wise order) and writes it out to the recipient data structures. The remaining intermediate transformations do just what the name says: they transform data values as required.

Sources, targets, and transformations are assembled in a daisy chain to form the next level of processing, which in PC7 is called the "mapping." A mapping is the end-to-end flow of logic, from one or more source transformations to one or more target transformations.
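To make the first two levels concrete, here is a minimal sketch in plain Python. It is not PowerCenter code and does not use any Informatica API. It only illustrates the idea of a source feeding rows through a chain of transformations into a target. Every name in it (`source`, `trim_names`, `drop_inactive`, `run_mapping`) is hypothetical, and the in-memory list stands in for a real target table.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Transformation = Callable[[Iterator[Row]], Iterator[Row]]


def source(rows: Iterable[Row]) -> Iterator[Row]:
    """Source: supply rows one at a time, in order (copies each row)."""
    for row in rows:
        yield dict(row)


def trim_names(rows: Iterator[Row]) -> Iterator[Row]:
    """Intermediate transformation: cleanse the 'name' value on each row."""
    for row in rows:
        row["name"] = str(row["name"]).strip().title()
        yield row


def drop_inactive(rows: Iterator[Row]) -> Iterator[Row]:
    """Intermediate transformation: pass through only rows flagged active."""
    for row in rows:
        if row["active"]:
            yield row


def run_mapping(
    src_rows: Iterable[Row],
    transformations: List[Transformation],
    target: List[Row],
) -> int:
    """Mapping: daisy-chain source -> transformations -> target.

    Returns the number of rows written to the target.
    """
    stream = source(src_rows)
    for transform in transformations:
        stream = transform(stream)  # each step consumes the previous one
    written = 0
    for row in stream:
        target.append(row)  # the target "writes" the row
        written += 1
    return written


if __name__ == "__main__":
    out: List[Row] = []
    count = run_mapping(
        [{"name": "  ada lovelace ", "active": True},
         {"name": "bob", "active": False}],
        [trim_names, drop_inactive],
        out,
    )
    print(count, out)
```

Generators are used here because they mirror the row-wise flow described above: each transformation pulls one row at a time from the step before it, so nothing is held in memory except the row in flight.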

The execution of the mapping, called the "workflow" in PC7, provides the third level of the overall logic. The workflow provides for the execution of multiple mappings and dependencies among mappings. In standard programming terms, the transforms are the syntax and components of the program, the mapping is the overall program itself, and the workflow is the execution and production of one or more programs.
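The third level can be sketched the same way. The example below is an assumption-laden illustration, not how the PC7 Workflow Manager is implemented: it treats each mapping as a callable and uses Python's standard `graphlib.TopologicalSorter` (Python 3.9 or later) to run mappings only after the mappings they depend on. The function name `run_workflow` and the mapping names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_workflow(
    mappings: Dict[str, Callable[[], None]],
    depends_on: Dict[str, Set[str]],
) -> List[str]:
    """Run every mapping once, after all of the mappings it depends on.

    mappings   -- mapping name -> callable that executes the mapping
    depends_on -- mapping name -> names that must finish first
    Returns the order in which the mappings were executed.
    A circular dependency raises graphlib.CycleError.
    """
    graph = {name: depends_on.get(name, set()) for name in mappings}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        mappings[name]()  # KeyError if a dependency was never defined
    return order


if __name__ == "__main__":
    log: List[str] = []
    workflow = {
        "load_fact": lambda: log.append("load_fact"),
        "load_dim": lambda: log.append("load_dim"),
        "stage": lambda: log.append("stage"),
    }
    executed = run_workflow(
        workflow,
        {"load_fact": {"load_dim"}, "load_dim": {"stage"}},
    )
    print(executed)
```

In the analogy above, each callable plays the role of a finished program (a mapping), and `run_workflow` plays the role of the scheduler that decides what runs when.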

There are PC7 components that correspond to these levels. The PowerCenter Designer is the programming integrated development environment (IDE), where you "assemble" all the sources, targets, and transformations to create a mapping. The PowerCenter Workflow Manager is used to build a workflow around the mapping. The Workflow Monitor provides production support capabilities for the workflow. In addition, there are the PowerCenter Repository Manager and the Repository Server Manager, which provide administration capabilities for the PC7 Repository (more on this a little later).

Conventional Improvements

A key measure of an ETL tool's strength is the number of sources and targets it supports, and the variety and performance of the transformations. PC7 supports a wide variety of data sources, such as relational (using native connectivity), ODBC, XML, and fixed-width and delimited flat files. The acquisition of Striva, a leader in mainframe connectivity tools, adds to Informatica's connectivity repertoire. In addition, Informatica PowerExchange (formerly called PowerConnect) is a family of gateways that allow access to applications such as SAP, Siebel, and PeopleSoft and to middleware and other solutions such as Tibco, IBM MQSeries, webMethods, and SAS. PC7 also introduces bidirectional support for Web services and allows PC7 to act as a provider as well as consumer of Web services. Informatica PowerChannel provides a secure extension to PC7 for purposes of data transfer across wide area networks and the Internet, by incorporating encryption and authentication technology from RSA, a leading security vendor. Together with the means to read and write data, PC7 provides numerous transformations that let you cleanse, transform, aggregate, and segregate data as needed, as well as apply data and business rules.

Data Warehouse Solutions

· e-Business Intelligence
· Re-Engineering of DW applications
· CRM Data Warehouse
· Data Warehousing Implementation
· Data delivery applications
· Data integration

Case Studies

· Business Management System
· Data Warehouse WAP System
· Cell-phone Users Survey
· BI Platform Development
· Data Warehouse Reporting Portal

Data Warehouse Technologies

OLAP tools: MicroStrategy, Brio, Business Objects, Cognos, Hyperion, Informix MetaCube

ETL tools: Informatica, DataStage, Data Junction, DataMirror

Databases: Oracle, MS SQL Server 2000, Sybase, DB2, Informix, Red Brick, Teradata

Data mining tools: SAS Miner, Intelligent Miner

End to end tools: SAP Business Warehouse suite and Oracle Data Warehousing product suite

