The extract-transform-load (ETL) system, or more informally, the "back room," is often estimated to consume 70 percent of the time and effort of building a data warehouse. But there hasn't been enough careful thinking about just why the ETL system is so complex and resource intensive. Everyone understands the three letters: You get the data out of its original source location (E), you do something to it (T), and then you load it (L) into a final set of tables for the users to query.
When asked about breaking down the three big steps, many designers say, "Well, that depends." It depends on the source, it depends on funny data idiosyncrasies, it depends on the scripting languages and ETL tools available, it depends on the skills of the in-house staff, and it depends on the query and reporting tools the end users have.
The "it depends" response is dangerous because it becomes an excuse to roll your own ETL system, which in the worst-case scenario results in an undifferentiated spaghetti-mess of tables, modules, processes, scripts, triggers, alerts, and job schedules. Maybe this kind of creative design approach was appropriate a few years ago when everyone was struggling to understand the ETL task, but with the benefit of thousands of successful data warehouses, a set of best practices is ready to emerge.
I have spent the last 18 months intensively studying ETL practices and ETL products. I have identified a list of 38 subsystems that are needed in almost every data warehouse back room. That's the bad news. No wonder the ETL system takes such a large fraction of the data warehouse resources. But the good news is that if you study the list, you'll recognize almost all of them, and you'll be on the way to leveraging your experience in each of these subsystems as you build successive data warehouses.
The 38 Subsystems
1. Extract system. Source data adapters, push/pull/dribble job schedulers, filtering and sorting at the source, proprietary data format conversions, and data staging after transfer to ETL environment.
2. Change data capture system. Source log file readers, source date and sequence number filters, and CRC-based record comparison in ETL system.
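As a rough illustration of the CRC-based record comparison mentioned above, here is a minimal Python sketch; the `customer_id` natural key and the field layout are hypothetical, and a real ETL system would persist the CRCs between loads:

```python
import zlib

def crc_of_record(record):
    """CRC-32 over the record's field values, in a stable field order."""
    payload = "|".join(str(record[k]) for k in sorted(record))
    return zlib.crc32(payload.encode("utf-8"))

def detect_changes(stored_crcs, incoming, natural_key="customer_id"):
    """Compare incoming records against CRCs kept from the last load.
    stored_crcs maps natural key -> CRC; returns (inserts, updates)."""
    inserts, updates = [], []
    for rec in incoming:
        key = rec[natural_key]
        crc = crc_of_record(rec)
        if key not in stored_crcs:
            inserts.append(rec)          # brand-new record
        elif stored_crcs[key] != crc:
            updates.append(rec)          # record changed since the last load
    return inserts, updates
```

Comparing CRCs rather than full records keeps the stored comparison state small, at the cost of a vanishingly small chance of a hash collision masking a change.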
3. Data profiling system. Column property analysis including discovery of inferred domains, and structure analysis including candidate foreign key — primary key relationships, data rule analysis, and value rule analysis.
4. Data cleansing system. Typically a dictionary-driven system for complete parsing of names and addresses of individuals and organizations, possibly also products or locations. "De-duplication," including identification and removal of duplicates, usually of individuals and organizations, possibly products or locations; often uses fuzzy logic. "Surviving," using specialized data merge logic that preserves specified fields from certain sources as the final saved versions. Maintains back references (such as natural keys) to all participating original sources.
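The fuzzy-logic de-duplication step can be illustrated with a short sketch. This uses Python's `difflib` similarity ratio as a stand-in for a real matching engine, and the 0.85 threshold is an arbitrary illustrative choice:

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.85):
    """Fuzzy-match two name strings after simple normalization."""
    a, b = a.lower().strip(), b.lower().strip()
    return SequenceMatcher(None, a, b).ratio() >= threshold

def dedupe(names):
    """Keep the first occurrence ("survivor") of each fuzzy-duplicate group."""
    survivors = []
    for name in names:
        if not any(similar(name, s) for s in survivors):
            survivors.append(name)
    return survivors
```

A production cleansing tool would parse names into components before matching; this sketch only shows why a similarity threshold, rather than exact equality, drives the removal decision.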
5. Data conformer. Identification and enforcement of special conformed dimension attributes and conformed fact table measures as the basis for data integration across multiple data sources.
6. Audit dimension assembler. Assembly of metadata context surrounding each fact table load in such a way that the metadata context can be attached to the fact table as a normal dimension.
7. Quality screen handler. In-line ETL tests applied systematically to all data flows, checking for data quality issues. One of the feeds to the error event handler (see subsystem 8).
8. Error event handler. Comprehensive system for reporting and responding to all ETL error events. Includes branching logic to handle various classes of errors, and includes real-time monitoring of ETL data quality.
9. Surrogate key creation system. Robust mechanism for producing stream of surrogate keys, independently for every dimension. Independent of database instance, able to serve distributed clients.
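A minimal sketch of such a key source, assuming a single in-process generator per dimension; a production system would persist its high-water mark and expose the generator as a service to distributed clients:

```python
import itertools
import threading

class SurrogateKeyGenerator:
    """Issues a monotonically increasing surrogate key stream for one
    dimension, independent of any database sequence."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()   # serialize concurrent callers

    def next_key(self):
        with self._lock:
            return next(self._counter)

# One independent generator per dimension.
customer_keys = SurrogateKeyGenerator()
product_keys = SurrogateKeyGenerator(start=1000)
```

Keeping the counter outside the database is what makes the stream independent of any database instance, as the subsystem description requires.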
10. Slowly Changing Dimension (SCD) processor. Transformation logic for handling three types of time variance possible for a dimension attribute: Type 1 (overwrite), Type 2 (create new record), and Type 3 (create new field).
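The Type 1 and Type 2 responses can be sketched as follows. The dimension is modeled as a list of dicts, and all field names (`customer_id`, `city`, `is_current`) are illustrative:

```python
from datetime import date

def apply_scd(dim_rows, incoming, scd_type, next_key):
    """Apply a Type 1 (overwrite) or Type 2 (new row) change to a
    dimension table held as a list of dicts."""
    current = next(r for r in dim_rows
                   if r["customer_id"] == incoming["customer_id"]
                   and r["is_current"])
    if scd_type == 1:
        current["city"] = incoming["city"]        # overwrite in place
    elif scd_type == 2:
        current["is_current"] = False             # expire the old version
        current["end_date"] = date.today()
        dim_rows.append({
            "surrogate_key": next_key(),          # new surrogate key
            "customer_id": incoming["customer_id"],
            "city": incoming["city"],
            "start_date": date.today(),
            "end_date": None,
            "is_current": True,
        })
```

Type 3 (not shown) would instead add a "previous value" field alongside the current one, keeping a single row per natural key.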
11. Late arriving dimension handler. Insertion and update logic for dimension changes that have been delayed in arriving at the data warehouse.
12. Fixed hierarchy dimension builder. Data validity checking and maintenance system for all forms of many-to-one hierarchies in a dimension.
13. Variable hierarchy dimension builder. Data validity checking and maintenance system for all forms of ragged hierarchies of indeterminate depth, such as organization charts and parts explosions.
14. Multivalued dimension bridge table builder. Creation and maintenance of associative (bridge) table used to describe a many-to-many relationship between dimensions. May include weighting factors used for allocations and situational role descriptions.
15. Junk dimension builder. Creation and maintenance of dimensions consisting of miscellaneous low cardinality flags and indicators found in most production data sources.
16. Transaction grain fact table loader. System for updating transaction grain fact tables including manipulation of indexes and partitions. Normally append mode for most recent data. Uses surrogate key pipeline (see subsystem 19).
17. Periodic snapshot grain fact table loader. System for updating periodic snapshot grain fact tables including manipulation of indexes and partitions. Includes frequent overwrite strategy for incremental update of current period facts. Uses surrogate key pipeline (see subsystem 19).
18. Accumulating snapshot grain fact table loader. System for updating accumulating snapshot grain fact tables including manipulation of indexes and partitions, and updates to both dimension foreign keys and accumulating measures. Uses surrogate key pipeline (see subsystem 19).
19. Surrogate key pipeline. Pipelined, multithreaded process for replacing natural keys of incoming data with data warehouse surrogate keys.
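Conceptually, the pipeline substitutes surrogate keys via in-memory dimension lookups. A single-threaded Python sketch with hypothetical dimension and field names:

```python
def surrogate_key_pipeline(facts, dim_lookups):
    """Replace each natural key on incoming fact records with the
    dimension's surrogate key. dim_lookups maps dimension name ->
    {natural key: surrogate key}."""
    resolved = []
    for fact in facts:
        row = dict(fact)
        for dim, lookup in dim_lookups.items():
            natural = row.pop(dim + "_natural_key")
            row[dim + "_key"] = lookup[natural]  # KeyError here flags a late-arriving dimension member
        resolved.append(row)
    return resolved
```

The real pipeline runs each dimension's substitution as a concurrent stage so fact rows stream through without landing on disk; the sketch only shows the key substitution itself.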
20. Late arriving fact handler. Insertion and update logic for fact records that have been delayed in arriving at the data warehouse.
21. Aggregate builder. Creation and maintenance of physical database structures, known as aggregates, that are used in conjunction with a query-rewrite facility, to improve query performance. Includes stand-alone aggregate tables and materialized views.
22. Multidimensional cube builder. Creation and maintenance of star schema foundation for loading multidimensional (OLAP) cubes, including special preparation of dimension hierarchies as dictated by the specific cube technology.
23. Real-time partition builder. Special logic for each of the three fact table types (see subsystems 16, 17, and 18) that maintains a "hot partition" in memory containing only the data that has arrived since the last update of the static data warehouse tables.
24. Dimension manager system. Administration system for the "dimension manager" who replicates conformed dimensions from a centralized location to fact table providers. Paired with subsystem 25.
25. Fact table provider system. Administration system for the "fact table provider" who receives conformed dimensions sent by the dimension manager. Includes local key substitution, dimension version checking, and aggregate table change management.
26. Job scheduler. System for scheduling and launching all ETL jobs. Able to wait for a wide variety of system conditions including dependencies of prior jobs completing successfully. Able to post alerts.
27. Workflow monitor. Dashboard and reporting system for all job runs initiated by the Job Scheduler. Includes number of records processed, summaries of errors, and actions taken.
28. Recovery and restart system. Common system for resuming a job that has halted, or for backing out a whole job and restarting. Significant dependency on backup system (see subsystem 36).
29. Parallelizing/pipelining system. Common system for taking advantage of multiple processors, or grid computing resources, and common system for implementing streaming data flows. Highly desirable (eventually necessary) that parallelizing and pipelining be invoked automatically for any ETL process that meets certain conditions, such as not writing to the disk or waiting on a condition in the middle of the process.
30. Problem escalation system. Automatic plus manual system for raising an error condition to the appropriate level for resolution and tracking. Includes simple error log entries, operator notification, supervisor notification, and system developer notification.
31. Version control system. Consistent "snapshotting" capability for archiving and recovering all the metadata in the ETL pipeline. Check-out and check-in of all ETL modules and jobs. Source comparison capability to reveal differences between different versions.
32. Version migration system. Move a complete ETL pipeline implementation out of development, into test, and then into production. Interface to the version control system to back out a migration. Single interface for setting connection information for an entire version. Independence from database location for surrogate key generation.
33. Lineage and dependency analyzer. Display the ultimate physical sources and all subsequent transformations of any selected data element, chosen either from the middle of the ETL pipeline, or chosen on a final delivered report (lineage). Display all affected downstream data elements and final report fields affected by a potential change in any selected data element, chosen either in the middle of the ETL pipeline, or in an original source (dependency).
34. Compliance reporter. Comply with regulatory statutes to prove the lineage of key reported operating results. Prove that the data and the transformations haven't been changed. Show who has accessed or changed any such data.
35. Security system. Administer role-based security on all data and metadata in the ETL pipeline. Prove that a version of a module hasn't been changed. Show who has made changes.
36. Backup system. Backup data and metadata for recovery, restart, security, and compliance requirements.
37. Metadata repository manager. Comprehensive system for capturing and maintaining all ETL metadata, including all transformation logic. Includes process metadata, technical metadata, and business metadata.
38. Project management system. Comprehensive system for keeping track of all ETL development.
Tuesday, January 31, 2012
Ab Initio Best Practices
These are general guidelines that are ideal to implement in Ab Initio projects involving development, maintenance, and testing activities. The tips are collected from various sources on the net as well as from expert Ab Initio developers.
Project access control - Checking In and Checking out practices
* Before “Checking In” any graph, make sure it has been deployed successfully.
* Also, inform the ETL Admin before “Checking In.”
* To obtain the latest version of a graph, “Check Out” from the EME Data Store.
* Before running a graph, “Check Out” from the EME Data Store to your individual sandbox. If the graph is not present in the EME Data Store, “Check In” first and then run it.
* The Ab Initio sandbox for each authorized user should be created only by the ETL Admin.
* Before creating graphs on the server, ensure that the User ID and Password in the EME Settings and the Run Settings are the same.
* Before modifying a graph, ensure that it is locked to prevent sharing conflicts. Locking a graph prevents other users from modifying it at the same time. It is advisable that individual graphs are handled by separate users.
* Do not create any table in the target database. If one is needed, ask the DBA to do so.
* Any database-related activities and problems should be reported to the concerned DBA immediately.
* Before modifying any table in the target database, inform the concerned DBA and get approval.
* Do not change any of the environment variables. Because these variables are global to all graphs, they should not be tampered with. Only the ETL Admin has rights to set or modify them.
Good practices for project implementation
* While running a graph you may encounter errors, so maintain an error log for every error you come across. A consolidated, detailed error sheet should be maintained containing error and resolution information from all users; it can be used for reference when similar errors appear later. If you hit a database error, contact the DBA immediately.
* Ensure that you are using the relevant .dbc file in all your graphs.
* Always validate a graph before executing it and ensure that it validates successfully. Deploy the graph after successful validation.
* ab_project_setup.ksh should be executed on a regular basis. Contact the ETL Admin for further details.
* Before running a graph, check whether the test parameters are valid.
* After implementing the desired modifications, save and unlock the graph.
Handling run time related errors
* If you are testing a graph created by someone else, contact the person who created it or who made recent modifications to it; that person can assist you or make the change directly.
* If the error relates to an Admin settings problem, contact the ETL Admin immediately.
* If you face a problem you have not encountered and resolved before, look in the consolidated error sheet to check whether it has been previously faced and resolved by another user. You can also approach online tech forums for further input on the error.
Documentation practices
* Maintain documents recording all modifications performed on existing graphs or scripts.
* Maintain ETL design documents for all graphs created or modified. The documents should be updated whenever changes are made to existing graphs.
* While testing any graph, follow the testing rules as per the testing template. Maintain documents for all testing activities performed.
Good practices for underlying tables
* Ensure that in all graphs using RDBMS tables as input, the join condition is on indexed columns. If not, ensure that indexes are created on the columns used in the join condition. This is very important: without indexes there will be a full table scan, resulting in very poor performance. Before executing any graph, use Oracle's Explain Plan utility to find the execution path of the query.
* If there are indexes on the target table, drop them before running the graph and recreate them after the graph has run.
* If possible, shift sorting or aggregation of data to the source tables (provided you are using an RDBMS as a source and not a flat file). A SQL ORDER BY or GROUP BY clause will usually be faster than Ab Initio because the database server is typically more powerful than the Ab Initio server, and the database runs the statement through its optimizer.
* Bitmap indexes should not be created on tables that are updated frequently, and they tend to occupy a lot of disk space. Instead, a normal (B-tree) index may be created.
DML & XFR Usage
* Do not embed the DML if it belongs to a landed file or if it is going to be reused in another graph. Create DML files and specify them as a path.
* Do not embed the XFR if it is going to be reused in another graph. Create XFR files and specify them as a path.
Efficient usage of components
* Skinny the file if the source file contains more data elements than you need for downstream processing. Add a Reformat as your first component to eliminate any data elements that are not needed downstream.
* Apply any filter criteria as early in the flow as possible. This reduces the number of records you need to process later in the flow.
* Apply any rollups as early in the flow as possible, for the same reason.
* Separate functionality between components. If you need to perform a reformat and a filter on some data, use a Reformat component and a Filter component; do not perform both in the same component. If you have a justifiable reason to merge functionality, note it in the component description.
Ab Initio Interview Questions
What is the relation between EME, GDE, and the Co>Operating System?
Ans: EME stands for Enterprise Meta Environment, GDE for Graphical Development Environment, and the Co>Operating System is the Ab Initio server. The relationship among them is as follows:
The Co>Operating System is the Ab Initio server; it is installed on a particular OS platform, called the native OS. The EME is the repository (comparable to the repository in Informatica); it holds the metadata, transformations, db config files, and source and target information. The GDE is the end-user environment where graphs are developed (a graph is comparable to a mapping in Informatica).
A designer uses the GDE to design graphs and saves them to the EME or to a sandbox. The sandbox is on the user side, whereas the EME is on the server side.
What is the use of the Aggregate component when we have Rollup?
As we know, the Rollup component in Ab Initio is used to summarize groups of data records, so where would we use Aggregate?
Ans: Aggregate and Rollup can both summarize data, but Rollup is much more convenient to use, and a Rollup transform is much more explanatory about how a particular summarization is performed. Rollup can also do other things, such as input and output filtering of records. Aggregate and Rollup perform the same basic action, but Rollup can expose intermediate results in main memory, while Aggregate does not support intermediate results.
What kinds of layouts does Ab Initio support?
Basically there are serial and parallel layouts. A graph can have both at the same time. The parallel layout depends on the degree of data parallelism: if the multifile system is 4-way parallel, then a component in the graph can run 4-way parallel if its layout is defined to match that degree of parallelism.
How can you run a graph infinitely?
To run a graph infinitely, the end script in the graph should call the .ksh file of the graph itself. Thus if the graph is named abc.mp, the end script of the graph should contain a call to abc.ksh. In this way the graph will run infinitely.
How do you add default rules in the transformer?
Double-click the transform parameter on the Parameters tab of the component properties; this opens the Transform Editor. In the Transform Editor, click the Edit menu and select Add Default Rules from the dropdown. It shows two options: 1) Match Names 2) Wildcard.
Do you know what a local lookup is?
If your lookup file is a multifile partitioned/sorted on a particular key, then the local lookup function can be used ahead of the lookup function call. It is local to a particular partition, depending on the key.
A lookup file consists of data records that can be held in main memory. This lets the transform function retrieve records much faster than retrieving them from disk, and allows the transform component to process records from multiple files quickly.
What is the difference between a lookup file and a lookup, with a relevant example?
Generally, a lookup file represents one or more serial files (flat files) whose data is small enough to be held in memory. This allows transform functions to retrieve records much more quickly than they could from disk.
A lookup is the construct in an Ab Initio graph through which we can store data and retrieve it using a key parameter; a lookup file is the physical file where the data for the lookup is stored.
How many components are in your most complicated graph?
It depends on the type of components you use; it is usually best to avoid overly complicated transform functions in a graph.
Explain what a lookup is.
A lookup is basically a keyed dataset. It can be used to map values according to the data present in a particular file (serial or multifile). The dataset can be static or dynamic (for example, when the lookup file is generated in a previous phase and used as a lookup file in the current phase). Sometimes hash joins can be replaced by a Reformat plus a lookup, if one of the inputs to the join has few records with a slim record length.
Ab Initio has built-in functions to retrieve values from the lookup using its key.
What is a ramp limit?
The limit parameter contains an integer that represents a number of reject events. The ramp parameter contains a real number (from 0 to 1) that represents a rate of reject events relative to the number of records processed. Together they provide the threshold of allowed bad records:
number of bad records allowed = limit + (number of records processed * ramp)
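The threshold rule above can be expressed directly; this is a sketch of the formula, not Ab Initio's internal implementation:

```python
def reject_threshold(limit, ramp, records_processed):
    """Allowed number of reject events: limit + ramp * records processed,
    where ramp is a rate between 0 and 1."""
    return limit + ramp * records_processed

def should_abort(rejects, limit, ramp, records_processed):
    """True once the reject count exceeds the combined threshold."""
    return rejects > reject_threshold(limit, ramp, records_processed)
```

For example, with limit = 50 and ramp = 0.01, a job that has processed 1,000 records tolerates 60 rejects before aborting; the ramp term lets tolerance grow with job size while the limit term covers small jobs.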
Have you worked with packages?
Multistage transform components use packages by default. However, a user can also create his own set of functions in a transform package and include it in other transforms.
Have you used the Rollup component? Describe how.
If you want to group records on particular field values, Rollup is the best way to do it. Rollup is a multistage transform and it contains the following mandatory functions:
1. initialise
2. rollup
3. finalise
You also need to declare a temporary variable if you want to get counts for a particular group.
For each group, Rollup first calls the initialise function once, then calls the rollup function for each record in the group, and finally calls the finalise function once after the last rollup call.
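The initialise/rollup/finalise protocol can be mimicked in a short Python sketch; here the temporary variable is a per-group counter:

```python
def rollup(records, key, initialise, update, finalise):
    """Multistage rollup: for each group, call initialise once, the
    update (rollup) function once per record, and finalise once at the end."""
    groups = {}
    for rec in records:
        k = rec[key]
        if k not in groups:
            groups[k] = initialise()         # once per group
        groups[k] = update(groups[k], rec)   # once per record
    return {k: finalise(tmp) for k, tmp in groups.items()}

# Count records per group: the temporary variable holds the running count.
counts = rollup(
    [{"dept": "a"}, {"dept": "a"}, {"dept": "b"}],
    key="dept",
    initialise=lambda: 0,
    update=lambda tmp, rec: tmp + 1,
    finalise=lambda tmp: tmp,
)
```

This is only a model of the calling order; in Ab Initio the three functions live in the component's transform package and operate on DML records.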
How do you add default rules in the transformer?
Add Default Rules opens the Add Default Rules dialog. Select one of the following: Match Names, which generates a set of rules that copies input fields to output fields with the same name, or Use Wildcard (.*) Rule, which generates one rule that copies input fields to output fields with the same name.
1) If it is not already displayed, display the Transform Editor grid.
2) Click the Business Rules tab if it is not already displayed.
3) Select Edit > Add Default Rules.
In the case of a Reformat, if the destination field names are the same as (or a subset of) the source fields, there is no need to write anything in the Reformat xfr, unless you want a real transform beyond reducing the set of fields or splitting the flow into a number of flows.
What is the difference between partitioning with key and round robin?
Partition by Key (hash partitioning) distributes data by the values of a key and works well when the keys are diverse. If one key value is present in large volume, there can be large data skew; even so, this method is used most often for parallel data processing because records with the same key land in the same partition.
Round-robin partitioning uniformly distributes the data across the destination partitions. The skew is zero when the number of records is divisible by the number of partitions. A real-life example is how a pack of 52 cards is dealt to 4 players in round-robin fashion.
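The two techniques can be contrasted in a few lines of Python; this is illustrative only, since Ab Initio's partitioners operate on flows rather than in-memory lists:

```python
def hash_partition(records, key, n):
    """Partition by key: all records sharing a key land in one partition,
    so downstream per-key work (joins, rollups) stays local."""
    parts = [[] for _ in range(n)]
    for rec in records:
        parts[hash(rec[key]) % n].append(rec)
    return parts

def round_robin_partition(records, n):
    """Distribute records evenly regardless of content; zero skew
    when the record count divides evenly by n."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[i % n].append(rec)
    return parts
```

The trade-off is visible in the code: hash partitioning preserves key locality but inherits whatever skew the key distribution has, while round robin guarantees balance but scatters records with the same key.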
How do you improve the performance of a graph?
There are many ways the performance of a graph can be improved:
1) Use a limited number of components in a particular phase
2) Use optimum max-core values for Sort and Join components
3) Minimise the number of Sort components
4) Minimise sorted Join components and, if possible, replace them with in-memory/hash joins
5) Use only required fields in Sort, Reformat, and Join components
6) Use phasing/flow buffers in the case of merges and sorted joins
7) If the two inputs are huge, use a sorted join; otherwise use a hash join with the proper driving port
8) For a large dataset, don't use broadcast as a partitioner
9) Minimise the use of regular expression functions like re_index in transform functions
10) Avoid repartitioning data unnecessarily
Try to run the graph in MFS for as long as possible. For this, input files should be partitioned and, if possible, the output file should also be partitioned.
How do you truncate a table?
From Ab Initio, use the Run SQL component with the DDL "truncate table <table_name>", or use the Truncate Table component in Ab Initio.
Have you ever encountered an error called "depth not equal"?
When two components are linked together, if their layouts do not match, this error can occur during compilation of the graph. A solution is to use a partitioning component between them when there is a change in layout.
What function would you use to convert a string into a decimal?
No specific function is required if the sizes of the string and the decimal are the same; a decimal cast with the size in the transform function will suffice. For example, if the source field is defined as string(8) and the destination as decimal(8):
out.field :: (decimal(8)) in.field;
If the destination field is smaller than the input, the string_substring function can be used. Say the destination field is decimal(5):
out.field :: (decimal(5)) string_lrtrim(string_substring(in.field, 1, 5)); /* string_lrtrim trims leading and trailing spaces */
What are primary keys and foreign keys?
A primary key is a column (or set of columns) that uniquely identifies each record in a table and cannot contain nulls. A foreign key is a column in one table that references the primary key of another table, enforcing referential integrity between the two. In a dimensional model, the fact table carries foreign keys that reference the surrogate primary keys of the dimension tables.
ans. EME is said as enterprise metdata env, GDE as graphical devlopment env and Co-operating sytem can be said as asbinitio server
relation b/w this CO-OP, EME AND GDE is as fallows
Co operating system is the Abinitio Server. this co-op is installed on perticular O.S platform that is called NATIVE O.S .comming to the EME, its i just as repository in informatica , its hold the metadata,trnsformations,db config files source and targets informations. comming to GDE its is end user envirinment where we can devlop the graphs(mapping just like in informatica)
desinger uses the GDE and designs the graphs and save to the EME or Sand box it is at user side.where EME is ast server side.
What is the use of the Aggregate component when we have Rollup?
We know the Rollup component in Ab Initio is used to summarize groups of data records, so where would we use Aggregate?
Answer: Both Aggregate and Rollup can summarize data, but Rollup is much more convenient to use, and a summarization expressed as a rollup is far more self-explanatory than the equivalent aggregate. Rollup also offers additional functionality, such as input and output filtering of records. The two components perform the same basic action; the difference is that Rollup maintains intermediate results in main memory, which Aggregate does not support.
What kinds of layouts does Ab Initio support?
Basically, Ab Initio supports serial and parallel layouts, and a graph can have both at the same time. The parallel layout depends on the degree of data parallelism: if the multifile system is 4-way parallel, then a component in the graph can run 4-way parallel, provided its layout is defined to match that degree of parallelism.
How can you run a graph infinitely?
To run a graph infinitely, the end script in the graph should call the graph's own .ksh file. Thus, if the graph is named abc.mp, the end script should contain a call to abc.ksh; the graph will then relaunch itself and run infinitely.
How do you add default rules in transformer?
Double-click the transform parameter on the Parameters tab of the component properties; this opens the Transform Editor. In the Transform Editor, open the Edit menu and select Add Default Rules from the dropdown. It presents two options: 1) Match Names, 2) Wildcard.
Do you know what a local lookup is?
If your lookup file is a multifile, partitioned and sorted on a particular key, then the local lookup function can be used in place of the ordinary lookup call. It searches only the partition local to the component, as determined by the key.
A lookup file consists of data records that can be held in main memory. This lets a transform function retrieve records much faster than reading them from disk, and allows the transform component to process records from multiple files quickly.
What is the difference between look-up file and look-up, with a relevant example?
Generally, a lookup file represents one or more serial (flat) files whose data is small enough to be held in memory. This allows transform functions to retrieve records much more quickly than they could from disk.
A lookup is a component of an Ab Initio graph where we can store data and retrieve it using a key parameter.
A lookup file is the physical file where the data for the lookup is stored.
How many components were in your most complicated graph? It depends on the type of components you use; as a rule, avoid using too many complicated transform functions in a single graph.
Explain what a lookup is.
A lookup is basically a specific dataset that is keyed. It can be used to map values according to the data present in a particular file (serial or multifile). The dataset can be static or dynamic (for example, when the lookup file is generated in a previous phase and used as a lookup in the current phase). Sometimes hash joins can be replaced by a reformat plus a lookup, if one of the inputs to the join contains a small number of records with a short record length.
Ab Initio has built-in functions to retrieve values from the lookup using the key.
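The in-memory, keyed behavior described above can be sketched in Python (a hypothetical analogue for illustration; `load_lookup` and `lookup` are invented names, not Ab Initio functions):

```python
# Hypothetical Python analogue of lookup-file semantics (not the Ab Initio API).
# The dataset is small enough to index in memory once, so every subsequent
# lookup is an O(1) dictionary access instead of a disk read.

def load_lookup(records, key_field):
    """Index a small dataset by key, as a lookup file is held in memory."""
    return {rec[key_field]: rec for rec in records}

def lookup(index, key, default=None):
    """Retrieve the record for a key, mirroring a keyed lookup in a transform."""
    return index.get(key, default)

# Usage: map a country code to its name inside a transform.
countries = [{"code": "US", "name": "United States"},
             {"code": "IN", "name": "India"}]
idx = load_lookup(countries, "code")
# lookup(idx, "IN")["name"] -> "India"
```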
What is a ramp limit?
The limit parameter is an integer representing an absolute number of tolerated reject events.
The ramp parameter is a real number, basically a percentage value from 0 to 1, representing a tolerated rate of reject events per record processed.
The two together define the threshold of allowed bad records:
number of bad records allowed = limit + (number of records processed × ramp).
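The formula can be sketched as a small Python helper (illustrative only; the function names are made up):

```python
# Sketch of the reject-tolerance formula quoted above: a job tolerates a fixed
# number of rejects (limit) plus a fraction (ramp) of the records processed so far.

def reject_threshold(limit, ramp, records_processed):
    """Maximum number of reject events tolerated at this point in the run."""
    return limit + records_processed * ramp

def should_abort(bad_records, limit, ramp, records_processed):
    """True once rejects exceed the ramped threshold."""
    return bad_records > reject_threshold(limit, ramp, records_processed)

# With limit=50 and ramp=0.01, after 10,000 records up to 150 rejects are tolerated.
```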
Have you worked with packages?
Multistage transform components use packages by default. However, a user can create his own set of functions in a transform package and include it in other transform functions.
Have you used rollup component? Describe how.
If you want to group records on particular field values, Rollup is the best way to do it. Rollup is a multistage transform function and contains the following mandatory functions:
1. initialize
2. rollup
3. finalize
You also need to declare a temporary variable if you want to keep, say, a count for each group.
For each group, Rollup first calls the initialize function once, then calls the rollup function for each record in the group, and finally calls the finalize function once after the last rollup call.
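That call sequence can be sketched in Python (a simulation of the three-phase protocol, not Ab Initio DML; the function names simply mirror the phases above):

```python
# Simulate the initialize / rollup / finalize sequence: a per-group record
# count, with the temporary variable carried between the calls.
from itertools import groupby
from operator import itemgetter

def initialize(key):
    return {"key": key, "count": 0}        # temporary variable for the group

def roll(temp, record):
    temp["count"] += 1                     # called once per record in the group
    return temp

def finalize(temp):
    return (temp["key"], temp["count"])    # emitted once, after the last roll call

def run_rollup(records, key_field):
    records = sorted(records, key=itemgetter(key_field))  # groupby needs sorted input
    out = []
    for key, group in groupby(records, key=itemgetter(key_field)):
        temp = initialize(key)
        for rec in group:
            temp = roll(temp, rec)
        out.append(finalize(temp))
    return out
```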
How do you add default rules in transformer?
Add Default Rules opens the Add Default Rules dialog, offering two choices: Match Names, which generates a set of rules copying each input field to the output field of the same name, and Use Wildcard (.*) Rule, which generates a single rule that copies input fields to output fields with the same name. To get there:
1) If it is not already displayed, display the Transform Editor grid.
2) Click the Business Rules tab if it is not already displayed.
3) Select Edit > Add Default Rules.
In the case of a Reformat, if the destination field names are the same as (or a subset of) the source field names, there is no need to write anything in the reformat's transform, unless you want a real transform beyond reducing the set of fields or splitting the flow into a number of flows.
What is the difference between partitioning with key and round robin?
Partition by key (hash partitioning) is a technique used to partition data when the key values are diverse. If one key value occurs in large volume, there can be large data skew; even so, this method is the one used most often for parallel data processing.
Round-robin partitioning, by contrast, distributes records uniformly across the destination partitions; the skew is zero whenever the number of records is divisible by the number of partitions. A real-life example is how a pack of 52 cards is dealt to 4 players in round-robin fashion.
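The contrast can be sketched in Python (hypothetical helpers, not Ab Initio components):

```python
# Hash partitioning sends each record to a partition chosen by its key, so
# repeated key values pile up in one partition; round robin ignores content
# and deals records out evenly, like cards to players.

def partition_by_key(records, key, n_partitions):
    parts = [[] for _ in range(n_partitions)]
    for rec in records:
        parts[hash(rec[key]) % n_partitions].append(rec)
    return parts

def partition_round_robin(records, n_partitions):
    parts = [[] for _ in range(n_partitions)]
    for i, rec in enumerate(records):
        parts[i % n_partitions].append(rec)
    return parts

# Eight records that all share one key value: partition-by-key lands all eight
# in a single partition (maximum skew); round robin still yields two per partition.
```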
How do you improve the performance of a graph?
There are many ways the performance of the graph can be improved.
1) Use a limited number of components in a particular phase
2) Use optimum value of max core values for sort and join components
3) Minimise the number of sort components
4) Minimise sorted join component and if possible replace them by in-memory join/hash join
5) Use only required fields in the sort, reformat, join components
6) Use phasing/flow buffers in case of merge, sorted joins
7) If the two inputs are huge then use sorted join, otherwise use hash join with proper driving port
8) For large dataset don't use broadcast as partitioner
9) Minimise the use of regular expression functions like re_index in the transform functions
10) Avoid repartitioning of data unnecessarily
Try to run the graph as long as possible in MFS. For this, the input files should be partitioned and, if possible, the output files should also be partitioned.
How do you truncate a table?
From Ab Initio, run the SQL component using the DDL "truncate table <table name>".
Alternatively, use the Truncate Table component in Ab Initio.
Have you ever encountered an error called "depth not equal"?
When two components are linked together and their layouts do not match, this error can occur during compilation of the graph. A solution is to insert a partitioning component between them wherever the layout changes.
What is the function you would use to transfer a string into a decimal?
In this case no specific function is required if the sizes of the string and the decimal are the same; a decimal cast with the size in the transform function will suffice. For example, if the source field is defined as string(8) and the destination as decimal(8):
out.field :: (decimal(8)) in.field
If the destination field is smaller than the input, the string_substring function can be used, as follows (say the destination field is decimal(5)):
out.field :: (decimal(5))string_lrtrim(string_substring(in.field,1,5)) /* string_lrtrim used to trim leading and trailing spaces */
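A Python analogue of the same casts (illustration only; `to_decimal` is an invented helper, not a DML function):

```python
# Mimic (decimal(n))string_lrtrim(string_substring(s, 1, n)): truncate the
# string to the target width, trim surrounding spaces, then convert to a
# decimal value.
from decimal import Decimal

def to_decimal(s, width):
    return Decimal(s[:width].strip())

# to_decimal("12345678", 8) keeps all eight digits;
# to_decimal("12345678", 5) truncates to the first five.
```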
What are primary keys and foreign keys?
A primary key is a column (or set of columns) that uniquely identifies each record in a table. A foreign key is a column in one table that references the primary key of another table, enforcing referential integrity between the two tables.
Ab Initio Introduction
Ab Initio is Latin for "from the beginning". Ab Initio software works on a client-server model.
The client is called the Graphical Development Environment (GDE) and resides on the user's desktop. The server, or back end, is called the Co>Operating System, which can reside on a mainframe or a remote UNIX machine.
Ab Initio code is called a graph and has the .mp extension. A graph built in the GDE must be deployed as a corresponding .ksh script; on the Co>Operating System, that .ksh is run to do the required job.
How an Ab Initio job is run: what happens when you push the "Run" button?
•Your graph is translated into a script that can be executed in the Shell Development Environment.
•This script and any metadata files stored on the GDE client machine are shipped (via
FTP) to the server.
•The script is invoked (via REXEC or TELNET) on the server.
•The script creates and runs a job that may run across many hosts.
•Monitoring information is sent back to the GDE client.
Ab Initio environment: The advantage of Ab Initio code is that it can run in both serial and multifile-system environments. Serial environment: the normal UNIX file system. Multifile system: a multifile system (MFS) is meant for parallelism. In an MFS, a particular file is physically stored across different partitions of a machine, or even across different machines, but is pointed to by a logical file stored in the Co>Operating System. The logical file is a control file which holds pointers to the physical locations.
About Ab Initio graphs: An Ab Initio graph comprises a number of components serving different purposes. Data is read or written by a component according to its DML (not to be confused with a database's "data manipulation language"). The most commonly used components are described in the following sections.
Co>Operating System
Co>Operating System is a program provided by Ab Initio which operates on top of the operating system and is the base for all Ab Initio processes. It provides additional features, known as air commands, and can be installed on a variety of system environments such as UNIX, HP-UX, Linux, IBM AIX, and Windows. The Ab Initio Co>Operating System provides the following features:
- Manage and run AbInitio graphs and control the ETL processes
- Provides AbInitio extensions to the operating system
- ETL processes monitoring and debugging
- Metadata management and interaction with the EME
AbInitio GDE (Graphical Development Environment)
GDE is a graphical application for developers, used for designing and running AbInitio graphs. The ETL process in Ab Initio is represented by graphs, which are formed from components (from the standard component library or custom-built), flows (data streams), and parameters. The GDE provides:
- A user-friendly frontend for designing Ab Initio ETL graphs
- The ability to run and debug Ab Initio jobs and trace execution logs
- Graph compilation, which generates a UNIX shell script that may be executed on a machine without the GDE installed
AbInitio EME
Enterprise Meta>Environment (EME) is an AbInitio repository and environment for storing and managing metadata. It can store both business and technical metadata. EME metadata can be accessed from the Ab Initio GDE, a web browser, or the Co>Operating System command line (air commands).
Conduct>It
Conduct>It is an environment for building enterprise Ab Initio data integration systems. Its main role is to create AbInitio plans, a special type of graph constructed from other graphs and scripts. AbInitio provides both a graphical and a command-line interface to Conduct>It.
Data Profiler
The Data Profiler is an analytical application that can specify data range, scope, distribution, variance, and quality. It runs in a graphic environment on top of the Co>Operating system.
Component Library
The Ab Initio Component Library is a set of reusable software modules for sorting, data transformation, and high-speed database loading and unloading. It is a flexible and extensible tool which adapts at runtime to the record formats it receives, and it allows the creation and incorporation of new components derived from any program, permitting integration and reuse of external legacy code and storage engines.
Tuesday, August 16, 2011
10 Things to Do Before Starting a Business
Deciding to start a business can be an exciting and terrifying time all at once. You can alleviate your fear by being well prepared and not rushing into anything. There are ten steps all soon-to-be business owners should complete before that all-important opening day.
1. Scope out your industry.
Or, if you're just starting to think about entrepreneurship in general, find the best industry to fit your style and talents. For example, this year's burgeoning industries include interactive technology (from mobile app design to tech-savvy translation), wellness (healthy beverages), and little luxuries, such as baked goods. When you start honing in on a specialty area, seek out counselors and talk to industry veterans. You can go to SCORE, the SBA, the Women's Economic Development Agency, or scores more. The Internet, your local library, the U.S. Census Bureau, business schools, industry associations, can be invaluable sources of information and contacts. For instance, you might approach business schools in your area to see if one of their marketing classes will take on your business as a test project. You could potentially get some valuable market research results at no cost.
2. Size-up the competition.
Study your competition by visiting stores or locations where their products are offered. Say you want to open a new restaurant. For starters, create a list of restaurants in the area. Look at the menus, pricing, and additional features (e.g., valet parking or late night bar). Then check out the diners those restaurants appeal to. Are they young college students, neighborhood employees, or families? Then, become a customer of the competition. Go into stealth mode by visiting its website and putting yourself on its e-mail list. Read articles written on them. Sign up for e-mail alerts about search terms of your choice on Google News, which tracks hundreds of news sources. After you study it, deconstruct it using Fagan Finder, a bare-bones but very useful research site. Plug the address into the search box. You will be able to quickly learn, for example, the other sites that link to it, which can reveal alliances, networks, suppliers, and customers. Business data aggregators such as Dun & Bradstreet and InfoUSA provide detailed company information, including financials, although the services are not cheap. Your aim is to understand what your competition is doing so you can do it better.
3. Second-guess yourself.
"The biggest mistake I see these days is thinking that a business idea will automatically turn into a viable business model," says Terri Lonier, president and founder of Working Solo, a New Paltz, New York-based business strategy consultancy, and author of Working Solo: The Real Guide to Freedom and Financial Success with Your Own Business. Then again, what if the idea really is viable? "A lot of people start with a kitchen table idea," says Marla Tabaka, a business coach who writes The Successful Soloist blog for Inc.com. "It's a great idea you come up with your cousin at dinner. But then the business booms, and your growth gets out of control. You need a plan." Another important consideration is your personal financial resources. Make sure you have a considerable amount of capital set aside, especially because in a sole proprietorship you assume personal liability for all activities of that business. If you borrow money and can’t repay it, your personal assets are at stake.
4. Think about funding. A lot.
Can you bootstrap your company? Or are you going to need a small business loan? Might an entrepreneur in the family be able to invest, or should you look for venture capital or an angel investor? Money is a big topic for entrepreneurs, and you'll want to know your options early on. In order to get investors to open up their checkbooks, you’ll need to convince them that your idea is worthy and also be willing to subject yourself to increased scrutiny and give up a percentage of your company. That’s why it’s a good idea to first ask yourself whether you really need a professional investor at all, says David Henkel-Wallace, a serial entrepreneur who has raised $60 million from VCs. "If you’re starting a web software or mobile software company, you might be able to bootstrap it, which has the advantage that you get to keep all the money you earn," says Henkel-Wallace. "You could also look into borrowing from friends and family – or even take out a second mortgage – for the same reason." If you decide your business can only get to the next level with the aid of a professional investor, then you need to figure out what a potential backer looks for in a budding company, says Martin Babinec, who raised six rounds of funding through the business process outsourcing firm he founded, TriNet, which now boasts annual revenues in excess of $200 million. Start doing your research now, and don't talk to investors until you have a strategy that involves foreseeable future liquidity.
5. Refine your concept.
Adrienne Simpson initially intended to run a traditional moving company out of her home in October 2002. The idea came to her after relocating her mother from Georgia to Michigan. "I thought I'd put everything in a box, put it on a truck and send her on her way. Oh, no! Mom started walking me through her home, pointing at things saying, 'I'll take that, let's sell that, and I want to give that away,'" she recalls. By the second year of operation, Simpson shifted gears to make her Stone Mountain, Georgia-based company, Smooth Mooove, specialize in transporting seniors—and their beloved pets—and providing such value-add services as packaging, house cleaning, room reassembly, antique appraisals, estate sales, and charity donations. Her crew does everything: put clothes in the closets, hang drapes, make the bed, fill the refrigerator. But even still business was stalling. "I knew how to run an existing company, but I didn't know how to run a start-up," says Simpson, who worked 20 years for Blue Cross/Blue Shield and 10 years with Cigna Healthcare. Seeking money and marketing advice, Simpson went to the U.S. Small Business Administration (SBA) office in Atlanta and was connected to SCORE (Service Corps of Retired Executives) counselor Jeff Mesquita. "When you position your company you have to think outside of the box in terms of what makes you different from the competition," says Mesquita. "Adrienne described that what she does is move seniors from A to Z, so, when they arrive to their new home it is like walking into a hotel room." The only thing her clients have to bring is the clothes on their back (and maybe their pet under their arm). That's when Mesquita suggested the business name change to Smooth Mooove Senior Relocation Services. That same night, Simpson went to a networking event. When people asked 'what do you do?' and her response was 'I have a senior relocation service.' Right away people said 'Oh, you move seniors." The business took off from there.
6. Seek advice from friends, mentors … or anyone, really.
A mentor can be a boon to an entrepreneur in a broad range of scenarios, whether he or she provides pointers on business strategy, helps you bolster your networking efforts, or acts as a confidant when your work-life balance gets out of whack. But the first thing you need to know when seeking out a mentor is what you’re looking for from the arrangement. What can your mentor do for you? Determining what type of resource you need is a crucial first step in the mentor hunt. Lois Zachary, the president of Leadership Development Services, a Phoenix, Arizona-based business coaching firm, and author of The Mentee’s Guide: Making Mentoring Work for You, recommends starting with a list. You may want someone who’s a good listener, someone well connected, someone with expertise in, say, marketing, someone accessible. Ideally you could find a mentor with all of these qualities, but the reality is you may have to make some compromises. After you enumerate the qualities you’re looking for in a mentor, divide that list into wants and needs. Who's best as a mentor? Look within your family, friends, business community, academic community, and even at your competitors – well, not your direct competition, but you get the idea.
7. Pick a name.
Naming your business can be a stressful process. You want to choose a name that will last and, if possible, will embody both your values and your company’s distinguishing characteristics. But screening long lists of names with a focus group composed of friends and family can return mixed results. Alternatively, a naming firm will ask questions to learn more about your culture and what's unique about you - things you'll want to communicate to consumers. One thing that Phillip Davis, the founder of Tungsten Branding, a Brevard, North Carolina-based naming firm, asks entrepreneurs is "do you want to fit in or stand out?" It seems straightforward. Who wouldn't want to stand out? But Davis explains that some businesses are so concerned about gaining credibility in their field, often those in financial services or consulting, that they will sacrifice an edgy or attention-getting name. "However, in the majority of cases, clients want to stand out and that's a better approach when looking at your long-term goals. Even the companies that say 'I just want to get my foot in the door' will usually begin wishing that they stood out more once they pass that first hurdle."
8. Get a grasp on marketing strategies.
You don't need to be a marketing whiz, but if you’re trying to build an idea from the ground up, you'll likely need to build an accompanying marketing strategy from the ground up. In doing so, you need to be clear on who your customers are, because you don’t have any time to waste on marketing to those who aren’t. "That’s really the biggest challenge, determining who exactly your customers are," Lonier says. "Many times entrepreneurs think they understand who their customers are, but you need to be willing to interview and test potential customers, particularly in the early days of a company, in order to be able to build those relationships." One way to make marketing easier is through joint-venture marketing, Tabaka says. When she owned a coffeehouse in Naperville, Illinois, she realized that her company and a major drugstore in the same shopping center could work together and support each other’s marketing goals. Another important and relatively easy way to get your name out into the market is building your web presence through social media like Twitter and Facebook. Be sure you familiarize yourself with and utilize Search Engine Optimization (SEO) to make it easier for people to find your website.
9. Do a little test-run.
"The best way to test your idea is if you're employed full-time and can sell your product or service in the marketplace on weekends," says Sapp. If the business is already your day job, then you have to move quickly to test, verify, and tweak your model," he adds. Try surveys, polls, and focus groups to gain insight into attitudes about your business idea. Solicit feedback on the cheap by using online survey tools available through such services as Zoomerang.com, Surveymonkey.com, and Constantcontact.com. The goal is to get to know your customers intimately. What turns them on? What causes them to tune out? Are they impulse buyers or do they like to deliberate over their buying decisions? There are a lot of products that people like but don't buy, says Sapp. The price might not be right, for example. "Use social media to hone in on certain groups that can become your focus group," says Susan Friedmann, a nichepreneur coach, in Lake Placid, New York and author of Riches in Niches: How to Make it Big in a Small Market. "Check out chat rooms, communities on social networks like Ning or Facebook, industry groups within LinkedIn," she says. "What are people discussing? Letters to the editor or articles in trade publications are resources for finding out about challenges in that particular industry. What are people writing about? What do people want to know about?" Knowing the answers to these types of questions may help you refine your idea.
10. Start searching for future talent.
This might sound premature, but don't forget that your business is supposed to grow someday. Keep your eyes peeled all the time for people who might fit into your organization – even if you can't afford to pay them yet. No matter how small the internet has made the world, experts still recommend in-person networking as the No. 1 way to recruit talent. "I've done a lot of placing people into positions, and I have never used a job board as a way to do that," says Rich Sloan, co-founder of StartupNation. "Personal is so much more powerful and important to me." So, if you meet someone interesting or knowledgeable at a networking event, or even if you get particularly impressive service somewhere, be it a museum gift shop or helpline, ask that person a bit about themselves and what kind of business they see themselves in in five years – and the best people around will stick in your mind for when you need them.
Finally opening day! It may be exciting, nerve-wracking, and scary as heck but enjoy the ride...today is the first day of your success.
1. Scope out your industry.
Or, if you're just starting to think about entrepreneurship in general, find the best industry to fit your style and talents. For example, this year's burgeoning industries include interactive technology (from mobile app design to tech-savvy translation), wellness (healthy beverages), and little luxuries, such as baked goods. When you start honing in on a specialty area, seek out counselors and talk to industry veterans. You can go to SCORE, the SBA, the Women's Economic Development Agency, or scores more. The Internet, your local library, the U.S. Census Bureau, business schools, industry associations, can be invaluable sources of information and contacts. For instance, you might approach business schools in your area to see if one of their marketing classes will take on your business as a test project. You could potentially get some valuable market research results at no cost.
2. Size-up the competition.
Study your competition by visiting stores or locations where their products are offered. Say you want to open a new restaurant. For starters, create a list of restaurants in the area. Look at the menus, pricing, and additional features (e.g., valet parking or late night bar). Then check out the diners those restaurants appeal to. Are they young college students, neighborhood employees, or families? Then, become a customer of the competition. Go into stealth mode by visiting its website and putting yourself on its e-mail list. Read articles written on them. Sign up for e-mail alerts about search terms of your choice on Google News, which tracks hundreds of news sources. After you study it, deconstruct it using Fagan Finder, a bare-bones but very useful research site. Plug the address into the search box. You will be able to quickly learn, for example, the other sites that link to it, which can reveal alliances, networks, suppliers, and customers. Business data aggregators such as Dun & Bradstreet and InfoUSA provide detailed company information, including financials, although the services are not cheap. Your aim is to understand what your competition is doing so you can do it better.
3. Second-guess yourself.
"The biggest mistake I see these days is thinking that a business idea will automatically turn into a viable business model," says Terri Lonier, president and founder of Working Solo, a New Paltz, New York-based business strategy consultancy, and author of Working Solo: The Real Guide to Freedom and Financial Success with Your Own Business. Then again, what if the idea really is viable? "A lot of people start with a kitchen table idea," says Marla Tabaka, a business coach who writes The Successful Soloist blog for Inc.com. "It's a great idea you come up with your cousin at dinner. But then the business booms, and your growth gets out of control. You need a plan." Another important consideration is your personal financial resources. Make sure you have a considerable amount of capital set aside, especially because in a sole proprietorship you assume personal liability for all activities of that business. If you borrow money and can’t repay it, your personal assets are at stake.
4. Think about funding. A lot.
Can you bootstrap your company? Or are you going to need a small business loan? Might an entrepreneur in the family be able to invest, or should you look for venture capital or an angel investor? Money is a big topic for entrepreneurs, and you'll want to know your options early on. In order to get investors to open up their checkbooks, you'll need to convince them that your idea is worthy, and also be willing to subject yourself to increased scrutiny and give up a percentage of your company. That's why it's a good idea to first ask yourself whether you really need a professional investor at all, says David Henkel-Wallace, a serial entrepreneur who has raised $60 million from VCs. "If you're starting a web software or mobile software company, you might be able to bootstrap it, which has the advantage that you get to keep all the money you earn," says Henkel-Wallace. "You could also look into borrowing from friends and family – or even take out a second mortgage – for the same reason." If you decide your business can only get to the next level with the aid of a professional investor, then you need to figure out what a potential backer looks for in a budding company, says Martin Babinec, who raised six rounds of funding for the business process outsourcing firm he founded, TriNet, which now boasts annual revenues in excess of $200 million. Start doing your research now, and don't talk to investors until you have a strategy that involves foreseeable future liquidity.
5. Refine your concept.
Adrienne Simpson initially intended to run a traditional moving company out of her home in October 2002. The idea came to her after relocating her mother from Georgia to Michigan. "I thought I'd put everything in a box, put it on a truck and send her on her way. Oh, no! Mom started walking me through her home, pointing at things saying, 'I'll take that, let's sell that, and I want to give that away,'" she recalls. By the second year of operation, Simpson shifted gears to make her Stone Mountain, Georgia-based company, Smooth Mooove, specialize in transporting seniors—and their beloved pets—and providing such value-add services as packaging, house cleaning, room reassembly, antique appraisals, estate sales, and charity donations. Her crew does everything: puts clothes in the closets, hangs drapes, makes the bed, fills the refrigerator. But even still, business was stalling. "I knew how to run an existing company, but I didn't know how to run a start-up," says Simpson, who worked 20 years for Blue Cross/Blue Shield and 10 years with Cigna Healthcare. Seeking money and marketing advice, Simpson went to the U.S. Small Business Administration (SBA) office in Atlanta and was connected to SCORE (Service Corps of Retired Executives) counselor Jeff Mesquita. "When you position your company you have to think outside of the box in terms of what makes you different from the competition," says Mesquita. "Adrienne described that what she does is move seniors from A to Z, so, when they arrive at their new home it is like walking into a hotel room." The only thing her clients have to bring is the clothes on their back (and maybe their pet under their arm). That's when Mesquita suggested the business name change to Smooth Mooove Senior Relocation Services. That same night, Simpson went to a networking event. When people asked, 'What do you do?' her response was, 'I have a senior relocation service.' Right away people said, 'Oh, you move seniors.' The business took off from there.
6. Seek advice from friends, mentors … or anyone, really.
A mentor can be a boon to an entrepreneur in a broad range of scenarios, whether he or she provides pointers on business strategy, helps you bolster your networking efforts, or acts as a confidante when your work-life balance gets out of whack. But the first thing you need to know when seeking out a mentor is what you're looking for from the arrangement. What can your mentor do for you? Determining what type of resource you need is a crucial first step in the mentor hunt. Lois Zachary, the president of Leadership Development Services, a Phoenix, Arizona-based business coaching firm, and author of The Mentee's Guide: Making Mentoring Work for You, recommends starting with a list. You may want someone who's a good listener, someone well connected, someone with expertise in, say, marketing, someone accessible. Ideally you could find a mentor with all of these qualities, but the reality is you may have to make some compromises. After you enumerate the qualities you're looking for in a mentor, divide that list into wants and needs. Who's best as a mentor? Look within your family, friends, business community, academic community, and even at your competitors – well, not your direct competition, but you get the idea.
7. Pick a name.
Naming your business can be a stressful process. You want to choose a name that will last and, if possible, will embody both your values and your company’s distinguishing characteristics. But screening long lists of names with a focus group composed of friends and family can return mixed results. Alternatively, a naming firm will ask questions to learn more about your culture and what's unique about you - things you'll want to communicate to consumers. One thing that Phillip Davis, the founder of Tungsten Branding, a Brevard, North Carolina-based naming firm, asks entrepreneurs is "do you want to fit in or stand out?" It seems straightforward. Who wouldn't want to stand out? But Davis explains that some businesses are so concerned about gaining credibility in their field, often those in financial services or consulting, that they will sacrifice an edgy or attention-getting name. "However, in the majority of cases, clients want to stand out and that's a better approach when looking at your long-term goals. Even the companies that say 'I just want to get my foot in the door' will usually begin wishing that they stood out more once they pass that first hurdle."
8. Get a grasp on marketing strategies.
You don't need to be a marketing whiz, but if you're trying to build an idea from the ground up, you'll likely need to build an accompanying marketing strategy from the ground up as well. In doing so, you need to be clear on who your customers are, because you don't have any time to waste on marketing to those who aren't. "That's really the biggest challenge, determining who exactly your customers are," Lonier says. "Many times they think they understand who they are, but you need to be willing to interview and test potential customers, particularly in the early days of a company, in order to be able to build those relationships." One way to make marketing easier is through joint-venture marketing, Tabaka says. When she owned a coffeehouse in Naperville, Illinois, she realized that her company and a major drugstore in the same shopping center could work together and support each other's marketing goals. Another important and relatively easy way to get your name out into the market is building your web presence through social media like Twitter and Facebook. Be sure you familiarize yourself with and utilize Search Engine Optimization (SEO) to make it easier for people to find your website.
9. Do a little test-run.
"The best way to test your idea is if you're employed full-time and can sell your product or service in the marketplace on weekends," says Sapp. "If the business is already your day job, then you have to move quickly to test, verify, and tweak your model," he adds. Try surveys, polls, and focus groups to gain insight into attitudes about your business idea. Solicit feedback on the cheap by using online survey tools available through such services as Zoomerang.com, Surveymonkey.com, and Constantcontact.com. The goal is to get to know your customers intimately. What turns them on? What causes them to tune out? Are they impulse buyers or do they like to deliberate over their buying decisions? There are a lot of products that people like but don't buy, says Sapp. The price might not be right, for example. "Use social media to hone in on certain groups that can become your focus group," says Susan Friedmann, a nichepreneur coach in Lake Placid, New York, and author of Riches in Niches: How to Make it Big in a Small Market. "Check out chat rooms, communities on social networks like Ning or Facebook, industry groups within LinkedIn," she says. "What are people discussing? Letters to the editor or articles in trade publications are resources for finding out about challenges in that particular industry. What are people writing about? What do people want to know about?" Knowing the answers to these types of questions may help you refine your idea.
10. Start searching for future talent.
This might sound premature, but don't forget that your business is supposed to grow someday. Keep your eyes peeled all the time for people who might fit into your organization – even if you can't afford to pay them yet. No matter how small the internet has made the world, experts still recommend in-person networking as the No. 1 way to recruit talent. "I've done a lot of placing people into positions, and I have never used a job board as a way to do that," says Rich Sloan, co-founder of StartupNation. "Personal is so much more powerful and important to me." So, if you meet someone interesting or knowledgeable at a networking event, or even if you get particularly impressive service somewhere, be it a museum gift shop or helpline, ask that person a bit about themselves and what kind of business they see themselves in five years from now – and the best people around will stick in your mind for when you need them.
Finally, opening day! It may be exciting, nerve-wracking, and scary as heck, but enjoy the ride... today is the first day of your success.
Monday, October 19, 2009
PMP - PDUs
Your weeks of hard work paid off and you cleared the PMP examination. Congratulatory emails pour in from friends and colleagues, you're happy with yourself for a few days - and things return to normal. For a few people, a job hop happens soon after the certification, the honeymoon period with the new company comes and goes - and then things return to normal. What next?
All certification programs require their participants to keep abreast of the latest happenings in their respective fields - so does PMI. Here is the link for PMI's continuing certification requirements. PMI requires you to collect 60 professional development units (PDUs) in 3 years and then shell out a renewal fee ($60 for members, $150 for non-members) to stay in good standing on your PMP certification.
There are various activities that allow you to accumulate PDUs, and the simplest of them is doing your job as a project manager. For that, you can lay claim to 5 PDUs per year, or a total of 15 PDUs per 3-year cycle. So really it boils down to getting 45 more PDUs in 3 years to stay PMP certified. Three years is a lot of time, but believe me - if you don't plan well, you'll end up thinking 'that's a lot of PDUs' in the last year. To avoid last-year scrambling, it is better to participate in PDU-gleaning activities as soon as possible.
Contrary to what many money-making, PDU-selling websites out there would like you to believe, it is possible to get to that magical 45 without much burden on your pocket. Unless your company is willing to sponsor hundreds of dollars for tens of PDUs, you should look at some of the options below.
1. Manage projects and keep track of your work: As I mentioned earlier, this is the easiest task of all. 5 PDUs per year, 15 total, in Category 2H. The burden on your wallet is zilch; in fact, you make money doing this stuff, don't you?
2. Free webinars: These come under Category 3, which is for courses handled by PMI registered education providers. International Institute of Learning is one of them and regularly schedules free webinars. The last time I checked, they had twelve 1-hour sessions, each worth 1 PDU. That's 12 PDUs. This is free too; I have attended 3 of their sessions so far and found them very informative.
3. Free podcasts: Under Category 2-SDL, you can claim a maximum of 15 PDUs. Free 1-hour podcasts from PMPodcast give you 1 PDU each, and there are over 50 hours of audio material available. You can either download the episodes to your iTunes library or listen to them directly from the website. Again, you don't spend a dime from your pocket, but you end up learning a lot.
4. Volunteer work: A maximum of 20 PDUs can be gathered by working with your local PMI chapter or any recognized project management organization (non-employer) - 10 PDUs per year if you serve as an elected member of the chapter and 5 PDUs per year as a volunteer member. You can also get 5 PDUs per year for volunteer work with any legally recognized charitable organization. This activity involves some outlay for membership fees, traveling expenses, etc., but the satisfaction you get out of the volunteer work might offset any pinch to your pocket. Monetarily, this can also be cheaper than paying for PDU courses.
Doing the above four should get you to the magical 60 mark.
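To sanity-check the tally, here is a minimal Python sketch; the dictionary keys are informal labels I made up for this post, and the numbers are the per-cycle caps described above, not an official PMI schema:

```python
# Maximum PDUs claimable per 3-year cycle via the four low-cost
# routes described above (caps as stated in this post).
pdu_caps = {
    "working_as_pm": 15,    # Category 2H: 5 PDUs/year x 3 years
    "free_webinars": 12,    # Category 3: twelve 1-hour sessions
    "free_podcasts": 15,    # Category 2-SDL cap
    "volunteer_work": 20,   # chapter/charity volunteering cap
}

total = sum(pdu_caps.values())
print(total)  # 62 -- clears the 60-PDU requirement
```

Even with a couple of missed webinars or podcasts, the four routes together leave a small cushion above 60.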
If you're not willing to do volunteer work, then the other option is to look for economical (but not free) PDU options on the Internet. You'll see some expensive deals and some good ones. Folks who took the PMP exam based on PMBOK 2000 can look up online courses offered by PMStudy.com or PMCampus.com. PMStudy.com offers 40 PDUs for $80, while PMCampus offers 25 PDUs for $95. Based on how many PDUs you require, you can choose one of those courses. After you pay up, you'll get access to their online tests. Get a copy of the PMBOK third edition and review it before taking the examinations. The tests are available for 90 days after registration; though the exams are time-bound, there is no limit on the number of retries to reach the qualifying mark. Once completed, these courses give you the PDU information and a certificate to print.
If you earn more than 60 PDUs, note that you have the option to carry over up to 20 PDUs into the next certification cycle - but this applies only to the additional ones accumulated during the 3rd year. Remember to keep a record of all your PDU-gathering activities; they will come in handy if PMI chooses to audit your submissions.
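One way to read that carryover rule (my interpretation, not official PMI wording): the rollover is the surplus beyond 60, limited to what was earned in the third year, and hard-capped at 20. A small sketch:

```python
def carryover(total_pdus, third_year_pdus):
    """PDUs that roll into the next cycle: the surplus beyond 60,
    limited to PDUs earned in year 3, capped at 20."""
    surplus = max(0, total_pdus - 60)
    return min(20, surplus, third_year_pdus)

print(carryover(75, 30))  # 15: the surplus above 60 is the limit
print(carryover(95, 10))  # 10: only third-year PDUs roll over
print(carryover(95, 40))  # 20: hard cap on carryover
```

The three limits rarely coincide, which is exactly why keeping dated records of each activity matters: you need to show which PDUs landed in year 3.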
A Network Approach to Global Product Management
Having worked in a product-based multinational company from the United States and from India, I have come to experience first-hand the practicalities of the globalization process. In many ways, it's like a long-distance relationship: you have to keep nurturing it, or else it falls apart before you know it. Here are a few thoughts to consider -
1. It's a network: Recognize that globalization isn't just a means to 'go where the talent is' or 'where the customers are'. It's about decentralizing the organization such that the teams form a network of interdependent parts, rather than a hub-and-spoke setup. True global organizations recognize that the animal must have multiple heads connected to a body, and not multiple tentacles connected to a head.
2. Lead with competency: Each of the org units must have a clear charter of competency. Setting up an office for sustained performance and results hinges on what that office is expected to deliver on a sustained, long-term basis. Competency is what feeds the growth of the organization, guides the day-to-day decisions, and motivates folks on a consistent basis. If you don't define a competency for each office, it's merely a network of hired guns with no higher purpose. Always ask this question of each org unit - 'Why do you exist?' If the answer is more than two sentences, you probably don't have it right.
3. Build mutual trust: The network of organizational nodes must understand that the success of the whole depends on the success of each node in the network. Each unit must trust that every other unit will deliver on its end with quality and timeliness. Build the team to deliver on a meaningful and mission-critical charter, and let them go. You will see amazing results, provided you have selected the team carefully.
4. The safe-route trap: The safe-route mentality is a big trap that must be avoided. Companies find it hard to transition from a centralized model to a decentralized one, mainly because it requires some level of risk. It requires reshuffling and refactoring of organizational alignments and a big bet on an unknown. Companies end up taking the safe route of offshoring unimpactful things 'just to test out how it could work' or to ensure no major shake-ups are needed. Thus starts the downward spiral of mediocre expectations leading to mediocre performance. It's a self-fulfilling cycle - mediocre expectations attract mediocre talent, which underperforms, leading to even lower expectations.
5. Travel: Like any long-distance relationship, frequent travel and face time are critical to establishing a person-to-person working relationship. The same words have different meanings if they come from someone you know rather than from someone you've never met, simple as that. People often kick off a transition with a week-long 'transfer-of-information' session and expect that, henceforth, everything will be nice and dandy. During the transition period, people often attach certain mental pictures and adjectives to each other (e.g., 'he is talkative' or 'she is a geek'), and those tags last long after the memory of the person has faded. There starts the trouble, when the pre-determined adjectives drive one's picture of the other. As such, in an interdependent organization, frequently renewing the working relationship is crucial.
6. Burnout: On coordinated projects across continents, people do burn out taking night-time calls. Most of personal life unfolds in the evenings, and the rhythm of life is seriously disrupted even at two nights a week. It's worse if some of the folks on the call are at a significantly higher level of discomfort than others - they simply don't share the sense of urgency to keep the calls to the point. These calls are unavoidable, so it's important to have a structure for maximizing their productivity.
7. Asynchronous communication: Set up proper message boards, intranets, doc shares, workspaces, or whatever makes asynchronous communication easier. Invest in documenting everything *before* the plans are executed.
8. Establish redundancy: Have good bench strength. Have a farm system to develop the talent required to sustain the competency.
9. Dual echo chambers: Emails and words on the phone don't convey the difference between someone meaning 'dude, you are smoking something' and 'I don't think so' and 'I didn't think so'. Aside from that, the hallway conversations and sidebars in meetings amplify completely different parts of the spectrum of signals that the business continuously emits, and those differences don't come out until it's too late. When they do, it's in a charged-up environment, to the detriment of the entire business.