dw FAQ


  • 7/30/2019 dw FAQ

    1/77

DATA WAREHOUSE CONCEPTS

    1) What is Data warehousing?

Answer - A data warehouse can be considered as a storage area where interest-specific or relevant data........

    2) What are fact tables and dimension tables?

    Answer -As mentioned, data in a warehouse comes from the transactions. Fact table in a data warehouse consists........

    3) What is ETL process in data warehousing?

    Answer - ETL is Extract Transform Load. It is a process of fetching.........

    4) Explain the difference between data mining and data warehousing.

    Answer - Data warehousing is merely extracting data from different sources, cleaning the........

    5) What is an OLTP system and OLAP system?

    Answer - OLTP: Online Transaction and Processing helps and manages applications based........

    6) What are cubes?

    Answer -A data cube stores data in a summarized version which helps in a faster analysis of data..........

7) What is snowflake schema design in a database?

    Answer -A snowflake Schema in its simplest form is an arrangement of fact tables.........

    8) What is analysis service?

    Answer -Analysis service provides a combined view of the data used in OLAP.........

    9) Explain sequence clustering algorithm.

    Answer - Sequence clustering algorithm collects similar or related paths, sequences of data.........

    10) Explain discrete and continuous data in data mining.

Answer - Discrete data can be considered as defined or finite data..........

    11) Explain time series algorithm in data mining.

    Answer - Time series algorithm can be used to predict continuous values of data.......

    12) What is XMLA?

    Answer - XMLA is XML for Analysis which can be considered as a standard for accessing data in OLAP.......

    13) Explain the difference between Data warehousing and Business Intelligence.

    Answer - Data Warehousing helps you store the data while business intelligence.....

    14) What is Dimensional Modeling?


    15) What is surrogate key? Explain it with an example.

    Answer - Data warehouses commonly use a surrogate key.......

    16) What is the purpose of Factless Fact Table?

    Answer - Fact less tables are so called because they simply contain........

    17) What is a level of Granularity of a fact table?

    Answer -A fact table is usually designed at a low level of Granularity...........

    18) Explain the difference between star and snowflake schemas.

Answer - A snowflake schema design is usually more complex than a star schema.....

    19) What is the difference between view and materialized view?

    Answer -A view is created by combining data from different tables........

    20) What is a Cube and Linked Cube with reference to data warehouse?

    Answer -A data cube stores data in a summarized version which helps in a faster analysis............

    21) What is junk dimension?

    Answer - In scenarios where certain data may not be appropriate to store in....

    22) What are fundamental stages of Data Warehousing?

Answer - Stages of a data warehouse help to find and understand.......

    23) What is Virtual Data Warehousing?

Virtual Warehousing provides an aggregate view of the complete data inventory........

    24) What is active data warehousing?

The transactional data is captured and stored in the Active Data Warehouse......

    25) List down differences between dependent data warehouse and independent data warehouse.

Dependent data warehouses are built........

    26) What is data modeling and data mining? What is this used for?

    Designing a model for data or database is called data modelling........

    27) Difference between ER Modeling and Dimensional Modeling.

Dimensional modelling is very flexible from the user's perspective........

    28) What is snapshot with reference to data warehouse?

A snapshot of a data warehouse is a persisted report from the catalogue.........

    29) What is degenerate dimension table?


    30) What is Data Mart?

A Data Mart is a data repository that serves a particular community or group of users.......

    31) What is the difference between metadata and data dictionary?

Metadata describes data; it is data about data. It has information about how and when......

    32) Describe the various methods of loading Dimension tables.

    The following are the methods of loading dimension tables.......

    33) What is the difference between OLAP and data warehouse?

    The following are the differences between OLAP and data warehousing......

    34) Describe the foreign key columns in fact table and dimension table.

    The primary keys of entity tables are the foreign keys of dimension tables.......

    35) What is cube grouping?

A set of similar cubes built by a transformer is known as cube grouping. A single level in one dimension......

    36) Define the term slowly changing dimensions (SCD).

    Slowly changing dimension target operator is one of the SQL warehousing operators......
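The answer above is truncated in this copy. As general background (not part of the original answer), one widely used handling is SCD "Type 2": instead of overwriting a changed attribute, the current dimension row is closed and a new row with a new surrogate key is inserted, so history is preserved. A minimal sketch with purely illustrative field names:

```python
# Hypothetical SCD Type 2 sketch: close the current row, then insert a
# new version under a fresh surrogate key. Field names are illustrative.
def scd_type2_update(dim_rows, business_key, new_city, change_date):
    for row in dim_rows:
        if row["bk"] == business_key and row["current"]:
            if row["city"] == new_city:
                return dim_rows                  # attribute unchanged
            row["current"] = False               # close the old version
            row["end_date"] = change_date
    new_sk = max(r["sk"] for r in dim_rows) + 1  # next surrogate key
    dim_rows.append({"sk": new_sk, "bk": business_key, "city": new_city,
                     "current": True, "end_date": None})
    return dim_rows

dim = [{"sk": 1, "bk": "C100", "city": "Pune", "current": True, "end_date": None}]
scd_type2_update(dim, "C100", "Mumbai", "2019-07-30")
# dim now holds the closed "Pune" row and a current "Mumbai" row
```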

    37) What is a Star Schema?

    The simplest data warehousing schema is star schema.......

    38) Differences between star and snowflake schema.

    Star Schema: A de-normalized technique in which one fact table is associated with several dimension tables.......

    39) Explain the use of lookup tables and Aggregate tables.

    At the time of updating the data warehouse, a lookup table is used.......

    40) What is real time data-warehousing?

    The combination of real-time activity and data warehousing is called real time warehousing.......

    41) What is conformed fact? What is conformed dimensions use for?

Conformed facts allow measures to have the same names in different tables.......

    42) Define non-additive facts.

The facts that cannot be summed up for the dimensions present in the fact table are called non-additive facts.......
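A small illustration (not from the original answer) of why such facts cannot be summed: a percentage must be recomputed from additive components rather than added across rows.

```python
# Illustrative rows with an additive fact (sales) and a non-additive
# fact (margin percentage). All values are invented for the example.
rows = [
    {"region": "East", "sales": 100.0, "margin_pct": 20.0},
    {"region": "East", "sales": 300.0, "margin_pct": 40.0},
    {"region": "West", "sales": 200.0, "margin_pct": 30.0},
]

# Additive fact: summing sales across the region dimension is meaningful.
total_sales = sum(r["sales"] for r in rows)        # 600.0

# Non-additive fact: summing the percentages gives a meaningless 90.0;
# the true overall margin must be recomputed from additive components.
wrong_margin = sum(r["margin_pct"] for r in rows)
true_margin = sum(r["sales"] * r["margin_pct"] / 100 for r in rows) / total_sales * 100
```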

    43) Define BUS Schema.

    A BUS schema is to identify the common dimensions across business processes......

    44) List out difference between SAS tool and other tools.


    45) Why is SAS so popular?

Statistical Analysis System (SAS) is an integration of various software products which allows developers to perform.......

    46) What is data cleaning? How can we do that?

Data cleaning is also known as data scrubbing. Data cleaning is a process which ensures the set of data is correct and accurate......
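As an illustration of the idea (the field names and rules below are invented for the example), a scrubbing pass typically trims whitespace, normalizes case, rejects invalid rows and drops duplicates:

```python
# Minimal data-scrubbing sketch: trim whitespace, normalize case
# (toy title-casing), reject empty names, and drop exact duplicates.
def clean(records):
    seen, out = set(), []
    for rec in records:
        name = rec.get("name", "").strip().title()
        if not name:                 # reject invalid rows
            continue
        key = (name, rec.get("city", "").strip().title())
        if key in seen:              # drop duplicates
            continue
        seen.add(key)
        out.append({"name": key[0], "city": key[1]})
    return out
```

For example, `clean([{"name": " alice ", "city": "NY"}, {"name": "ALICE", "city": "ny"}])` collapses both rows into one normalized record.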

    47) Explain in brief about critical column.

A critical column is a column (usually granular) whose values change over a period of time.......

    48) What is data cube technology used for?

Data cube is a multi-dimensional structure. Data cube is a data abstraction to view aggregated data from a number of perspectives.

    49) What is Data Scheme?

Data Scheme is a diagrammatic representation that illustrates data structures and data relationships to each other in the relational database within the data warehouse................

    50) What is Bit Mapped Index?

    Bitmap indexes make use of bit arrays (bitmaps) to answer queries by performing bitwise logical operations..................
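A toy sketch of the idea (illustrative only, not a real index implementation): one bitmap per distinct column value, with Python integers standing in for the bit arrays, and queries answered by bitwise AND/OR.

```python
# Column values for six rows; one bitmap (an int used as a bit array)
# is built per distinct value, bit i set when row i has that value.
rows = ["red", "blue", "red", "green", "blue", "red"]

bitmaps = {}
for i, value in enumerate(rows):
    bitmaps[value] = bitmaps.get(value, 0) | (1 << i)

def matching_rows(bitmap):
    return [i for i in range(len(rows)) if bitmap & (1 << i)]

# WHERE color = 'red'
red_rows = matching_rows(bitmaps["red"])                      # rows 0, 2, 5
# WHERE color = 'red' OR color = 'green'  (bitwise OR of two bitmaps)
red_or_green = matching_rows(bitmaps["red"] | bitmaps["green"])
```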


    51) What is Bi-directional Extract?

In hierarchical, networked or relational databases, the data can be extracted, cleansed and transferred in two directions. The ability of a system to do this is referred to as bidirectional extracts................


    52) What is Data Collection Frequency?

Data collection frequency is the rate at which data is collected. However, the data is not just collected and stored. It goes through various stages of processing like extracting from various sources, cleansing, transforming and then storing in useful patterns................

    53) What is Data Cardinality?

Cardinality is the term used in database relations to denote the occurrences of data on either side of the relation................

    54) What is Chained Data Replication?

In Chain Data Replication, the non-official data set distributed among many disks provides for load balancing among the servers within the data warehouse...............


    55) What are Critical Success Factors?

Key areas of activity in which favorable results are necessary for a company to reach its goal. There are four basic types of CSF.......


    DATASTAGE QUESTIONS

    1. What is the flow of loading data into fact & dimensional tables?

A) Fact table - Table with a collection of Foreign Keys corresponding to the Primary Keys in the Dimension table. Consists of fields with numeric values.

    Dimension table - Table with a Unique Primary Key.

    Load - Data should first be loaded into the dimension table. Based on the primary key values in the dimension table, the data should then be loaded into the Fact table.
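The load order described above can be sketched as follows (table and key names are invented for the example): dimension rows are loaded first and given surrogate keys, then fact rows resolve their foreign keys against the dimension.

```python
# Illustrative source data: customer business keys and sales transactions.
customers = ["C100", "C200"]
sales = [("C100", 250.0), ("C200", 75.0), ("C100", 30.0)]

# Step 1: load the dimension table first, generating surrogate keys.
dim_customer = {bk: sk for sk, bk in enumerate(customers, start=1)}

# Step 2: load the fact table, replacing each business key with the
# dimension's surrogate (primary) key as the foreign key.
fact_sales = [(dim_customer[bk], amount) for bk, amount in sales]
# fact_sales -> [(1, 250.0), (2, 75.0), (1, 30.0)]
```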

    2. What is the default cache size? How do you change the cache size if needed?

A) Default cache size is 256 MB. We can increase it by going into the Datastage Administrator, selecting the Tunable tab and specifying the cache size there.

    3. What are types of Hashed File?

A) Hashed Files are classified broadly into 2 types.

    a) Static - Subdivided into 17 types based on Primary Key Pattern.

    b) Dynamic - Subdivided into 2 types: i) Generic ii) Specific.

    Dynamic files do not perform as well as a well-designed static file, but do perform better than a badly designed one. When creating a dynamic file you can specify several parameters (although all of these have default values).

    By default, a Hashed File is "Dynamic - Type Random 30 D".

    4. What does a Config File in parallel extender consist of?

    A) Config file consists of the following.

    a) Number of Processes or Nodes.

    b) Actual Disk Storage Location.

    5. What is Modulus and Splitting in Dynamic Hashed File?

A) In a Hashed File, the size of the file keeps changing randomly.

    If the size of the file increases, it is called "Modulus".

    If the size of the file decreases, it is called "Splitting".

6. What are Stage Variables, Derivations and Constraints?

    A) Stage Variable - An intermediate processing variable that retains its value during a read and doesn't pass the value into the target column.

    Derivation - An expression that specifies the value to be passed on to the target column.

    Constraint - A condition that is either true or false and that controls the flow of data along a link.

7. Types of views in Datastage Director?

    A) There are 3 types of views in Datastage Director.


    8. Types of Parallel Processing?

    A) Parallel Processing is broadly classified into 2 types.

    a) SMP - Symmetrical Multi Processing.

b) MPP - Massively Parallel Processing.

9. Orchestrate Vs Datastage Parallel Extender?

    A) Orchestrate itself is an ETL tool with extensive parallel processing capabilities, running on the UNIX platform. Datastage used Orchestrate with Datastage XE (beta version of 6.0) to incorporate the parallel processing capabilities. Now Datastage has purchased Orchestrate and integrated it with Datastage XE and released a new version, Datastage 6.0, i.e. Parallel Extender.

    10. Importance of Surrogate Key in Data warehousing?

A) A Surrogate Key is a Primary Key for a Dimension table. Its main importance is that it is independent of the underlying database, i.e. the Surrogate Key is not affected by changes going on in the database.

    11. How to run a Shell Script within the scope of a Data stage job?

A) By using the "ExecSH" command in the Before/After job properties.

    12. How to handle Date conversions in Datastage? Convert a mm/dd/yyyy format to yyyy-dd-mm?

    A) We use a) "Iconv" function - Internal Conversion.

    b) "Oconv" function - External Conversion.

The function to convert mm/dd/yyyy format to yyyy-dd-mm is:

    Oconv(Iconv(FieldName, "D/MDY[2,2,4]"), "D-YDM[4,2,2]")
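For comparison, the same mm/dd/yyyy to yyyy-dd-mm conversion sketched in Python (the Iconv/Oconv pair above is the DataStage BASIC way):

```python
# Parse mm/dd/yyyy (the Iconv step), then re-emit as yyyy-dd-mm
# (the Oconv step).
from datetime import datetime

def convert(date_str):
    parsed = datetime.strptime(date_str, "%m/%d/%Y")
    return parsed.strftime("%Y-%d-%m")

convert("07/30/2019")  # -> "2019-30-07"
```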

13. How do you execute a datastage job from the command line prompt?

    A) Using "dsjob" command as follows.

    dsjob -run -jobstatus projectname jobname

    14. Functionality of Link Partitioner and Link Collector?

Link Partitioner: It splits data into various partitions or data flows using various partitioning methods.

    Link Collector: It collects the data coming from the partitions, merges it into a single data flow and loads it to the target.

    15. Types of Dimensional Modeling?

A) Dimensional modeling is subdivided into 2 types.

    a) Star Schema - Simple & Much Faster. Denormalized form.

    b) Snowflake Schema - Complex with more Granularity. More normalized form.

    16. Differentiate Primary Key and Partition Key?

Primary Key is a combination of unique and not null. It can be a collection of key values, called a composite primary key. A Partition Key is just a part of the Primary Key. There are several methods of partitioning like Hash, DB2, Random etc. While using Hash partitioning we specify the Partition Key.
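A rough sketch of hash partitioning (the partition count and key choice are invented for the example): each row is routed by hashing only the partition key, a subset of the full primary key. `zlib.crc32` is used because Python's built-in `hash()` is randomized between runs.

```python
import zlib

NUM_PARTITIONS = 4

def partition_for(partition_key):
    # Stable hash so the same key always lands on the same partition.
    return zlib.crc32(partition_key.encode()) % NUM_PARTITIONS

# Rows of (customer_id, date, amount); the partition key is just
# customer_id, a part of the full (customer_id, date) primary key.
rows = [("C100", "2019-07-30", 250.0), ("C200", "2019-07-30", 75.0)]
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for row in rows:
    partitions[partition_for(row[0])].append(row)
```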


17. How does data in a Database differ from data in a Data Warehouse?

    A) Data in a Database is

    a) Detailed or Transactional

    b) Both Readable and Writable.

    c) Current.

18. Containers - Usage and Types?

    A) A Container is a collection of stages used for the purpose of reusability.

    There are 2 types of Containers.

    a) Local Container: Job specific.

    b) Shared Container: Used in any job within a project.

    19. Compare and Contrast ODBC and Plug-In stages?

    ODBC: a) Poor Performance.

    b) Can be used for Variety of Databases.

    c) Can handle Stored Procedures.

    Plug-In: a) Good Performance.

    b) Database specific. (Only one database)

    c) Cannot handle Stored Procedures.

    20. Dimension Modelling types along with their significance

    Data Modelling is Broadly classified into 2 types.

a) E-R Diagrams (Entity - Relationships).

    b) Dimensional Modelling.

Q 21 What are Ascential DataStage Products and Connectivity?

    Ans:

    Ascential Products

    Ascential DataStage

    Ascential DataStage EE (3)

    Ascential DataStage EE MVS

    Ascential DataStage TX

    Ascential QualityStage

    Ascential MetaStage

    Ascential RTI (2)

    Ascential ProfileStage

    Ascential AuditStage

    Ascential Commerce Manager

    Industry Solutions


    Files

    RDBMS

    Real-time

    PACKs

EDI

    Other

    Explain Data Stage Architecture?

    Data Stage contains two components,

    Client Component.

    Server Component.

    Client Component:

    Data Stage Administrator.

    Data Stage Manager

    Data Stage Designer

    Data Stage Director

    Server Components:

    Data Stage Engine

    Meta Data Repository

    Package Installer

    Data Stage Administrator:

    Used to create the project.

Contains a set of properties.

    We can set the buffer size (by default 128 MB) and we can increase it.

    We can set the Environment Variables.

    In Tunables we have In-process and Inter-process:

    In-process - Data is read sequentially.

    Inter-process - It reads the data as it comes.

    It just interfaces to metadata.

    Data Stage Manager:

    We can view and edit the Meta data Repository.

    We can import table definitions.


    Data Stage Designer:

We can create the jobs. We can compile the job. We can run the job. We can declare stage variables in transforms; we can call routines, transforms, macros and functions.

    We can write constraints.

Data Stage Director:

    We can run the jobs.

    We can schedule the jobs. (Schedule can be done daily, weekly, monthly, quarterly)

    We can monitor the jobs.

    We can release the jobs.

    What is Meta Data Repository?

Meta Data is data about the data.

    It also contains

    Query statistics

    ETL statistics

    Business subject area

    Source Information

    Target Information

    Source to Target mapping Information.

    What is Data Stage Engine?

It is a JAVA engine running in the background.

    What is Dimensional Modeling?

Dimensional Modeling is a logical design technique that seeks to present the data in a standard framework that is intuitive and allows for high-performance access.

    What is Star Schema?

Star Schema is a de-normalized multi-dimensional model. It contains a centralized fact table surrounded by dimension tables.

    Dimension Table: It contains a primary key and descriptive attributes.

    Fact Table: It contains foreign keys to the dimension tables, measures and aggregates.
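A toy star schema along the lines described above (all names and values invented for the example): the fact table holds foreign keys that resolve against the surrounding dimension tables.

```python
# Dimension tables: surrogate primary key -> descriptive attribute.
dim_product = {1: "Widget", 2: "Gadget"}
dim_date = {10: "2019-07-30"}

# Fact table: (product_fk, date_fk, amount) - foreign keys plus a measure.
fact_sales = [(1, 10, 250.0), (2, 10, 75.0)]

# A "star join": resolve each foreign key against its dimension table.
report = [(dim_product[p], dim_date[d], amt) for p, d, amt in fact_sales]
# -> [("Widget", "2019-07-30", 250.0), ("Gadget", "2019-07-30", 75.0)]
```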

    What is surrogate Key?

It is a 4-byte integer which replaces the transaction / business / OLTP key in the dimension table.

    We can store up to 2 billion records.

Why do we need a surrogate key?

It is used for integrating the data and may serve better than the primary key for index maintenance, joins, table size, key updates, disconnected inserts and partitioning.


    Explain Types of Fact Tables?

Factless Fact: It contains only foreign keys to the dimension tables.

    Additive Fact: Measures can be added across any dimensions.

    Semi-Additive: Measures can be added across some dimensions, e.g. percentage, discount.

    Non-Additive: Measures cannot be added across any dimensions, e.g. average.

    Conformed Fact: The definitions and measures of the two fact tables are the same, and the facts are measured across the dimensions with the same set of measures.

    Explain the Types of Dimension Tables?

Conformed Dimension: If a dimension table is connected to more than one fact table, the granularity defined in the dimension table is common across the fact tables.

    Junk Dimension: A dimension table which contains only flags.

    Monster Dimension: A dimension that changes rapidly is known as a Monster Dimension.

    Degenerate Dimension: It is a line-item-oriented fact table design.

    Q 22 What are stage variables?

Stage variables are declaratives in the Transformer Stage used to store values. Stage variables are active at run time (because memory is allocated at run time).

    Q 23 What is sequencer?

    It sets the sequence of execution of server jobs.

    Q 24 What are Active and Passive stages?

Active Stage: Active stages model the flow of data and provide mechanisms for combining data streams, aggregating data and converting data from one data type to another. E.g. Transformer, Aggregator, Sort, Row Merger etc.

    Passive Stage: A Passive stage handles access to databases for the extraction or writing of data. E.g. IPC stage, file types, Universe, Unidata, DRS stage etc.

    Q 25 What is ODS?

    Operational Data Store is a staging area where data can be rolled back.

    Q 26 What are Macros?

They are built from Data Stage functions and do not require arguments.

    A number of macros are provided in the JOBCONTROL.H file to facilitate getting information about the current job, and the links and stages belonging to the current job. These can be used in expressions.


These macros provide the functionality of using the DSGetProjectInfo, DSGetJobInfo, DSGetStageInfo and DSGetLinkInfo functions with the DSJ.ME token as the JobHandle, and can be used in all active stages and before/after subroutines. The macros provide the functionality for all the possible InfoType arguments for the DSGetInfo functions. See the Function call help topics for more details.

The available macros are:

    DSHostName

    DSProjectName

    DSJobStatus

    DSJobName

    DSJobController

    DSJobStartDate

    DSJobStartTime

    DSJobStartTimestamp

    DSJobWaveNo

    DSJobInvocations

DSJobInvocationId

    DSStageName

    DSStageLastErr

    DSStageType

    DSStageInRowNum

DSStageVarList

    DSLinkRowCount

    DSLinkLastErr

    DSLinkName

Examples:

    To obtain the name of the current job:

    MyName = DSJobName

    To obtain the full current stage name:

    MyName = DSJobName : "." : DSStageName

    Q 27 What is keyMgtGetNextValue?

It is a built-in transform that generates sequential numbers. Its input type is literal string and its output type is string.


Q 28 What are the types of stages?

    The stages are either passive or active stages. Passive stages handle

    access to databases for extracting or writing data. Active stages model the flow of data and

    provide mechanisms for combining data streams, aggregating data, and converting data from one data type

    to another.

    Q 29 What index is created on Data Warehouse?

    Bitmap index is created in Data Warehouse.

    Q 30 What is container?

A container is a group of stages and links. Containers enable you to simplify and modularize your server job designs by replacing complex areas of the diagram with a single container stage. You can also use shared containers as a way of incorporating server job functionality into parallel jobs.

    DataStage provides two types of container:

Local containers. These are created within a job and are only accessible by that job. A local container is edited in a tabbed page of the job's Diagram window.

    Shared containers. These are created separately and are stored in the Repository in the same way that jobs are. There are two types of shared container.

    Q 31 What is function? ( Job Control Examples of Transform Functions )

    Functions take arguments and return a value.

BASIC functions: A function performs mathematical or string manipulations on the arguments supplied to it, and returns a value. Some functions have 0 arguments; most have 1 or more. Arguments are always in parentheses, separated by commas, as shown in this general syntax:

    FunctionName(argument, argument)

DataStage BASIC functions: These functions can be used in a job control routine, which is defined as part of a job's properties and allows other jobs to be run and controlled from the first job. Some of the functions can also be used for getting status information on the current job; these are useful in active stage expressions and before- and after-stage subroutines.

Q 32 What are Routines?

    Routines are stored in the Routines branch of the Data Stage Repository, where you can create, view or edit them. The following programming components are classified as routines:

    Transform functions, Before/After subroutines, Custom UniVerse functions, ActiveX (OLE) functions, Web Service routines.






    Question: What is the default cache size? How do you change the cache size if needed?

    Answer:

Default cache size is 256 MB. We can increase it by going into the DataStage Administrator, selecting the

Tunables tab and specifying the cache size there.

    Question: What is Hash file stage and what is it used for?

    Answer:

Used for look-ups. It is like a reference table. It is also used in place of ODBC/OCI tables for better

performance.


Hashed Files are broadly classified into 2 types:

A) Static - Sub-divided into 17 types based on Primary Key pattern.

B) Dynamic - Sub-divided into 2 types:

i) Generic

ii) Specific

The default Hashed file is "Dynamic - Type Random 30 D".

    Question: What are Static Hash files and Dynamic Hash files?

    Answer:

    As the names itself suggest what they mean. In general we use Type-30 dynamic Hash files. The Data file

    has a default size of 2GB and the overflow file is used if the data exceeds the 2GB size.

    Question: What is the Usage of Containers? What are its types?

    Answer:

    Container is a collection of stages used for the purpose of Reusability.

    There are 2 types of Containers.

    A) Local Container: Job Specific

    B) Shared Container: Used in any job within a project.

    Question: Compare and Contrast ODBC and Plug-In stages?

    Answer:

ODBC stage: Poor performance; can be used for a variety of databases; can handle Stored Procedures.

Plug-in stage: Good performance; database specific (only one database); cannot handle Stored Procedures.

    Question: How do you execute datastage job from command line prompt?

    Answer:

    Using "dsjob" command as follows.

    dsjob -run -jobstatus projectname jobname
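A wrapper script might build this command line before invoking it. The sketch below only constructs the argument list; "dstage1" and "LoadFactJob" are hypothetical project and job names, and the actual invocation is commented out because it needs a DataStage server:

```python
import subprocess  # only needed if you actually invoke dsjob

# Build a dsjob command line; -param name=value entries are
# appended before the project and job names.
def build_dsjob_cmd(project, job, params=None):
    cmd = ["dsjob", "-run", "-jobstatus"]
    for name, value in (params or {}).items():
        cmd += ["-param", f"{name}={value}"]
    cmd += [project, job]
    return cmd

cmd = build_dsjob_cmd("dstage1", "LoadFactJob", {"RunDate": "2005-01-31"})
# subprocess.run(cmd, check=True)  # uncomment on a machine with DataStage installed
```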

    Question: What are the command line functions that import and export the DS jobs?

    Answer:

    dsimport.exe - imports the DataStage components.

    dsexport.exe - exports the DataStage components.

    Question: How to run a Shell Script within the scope of a Data stage job?

Answer:

Use the ExecSH built-in routine as a Before/After-job subroutine in the job properties to run a shell script.

    Question: What are OConv () and Iconv () functions and where are they used?

    Answer:

IConv() - Converts a string to an internal storage format.

    OConv() - Converts an expression to an output format.

Question: How to handle Date conversions in DataStage? Convert mm/dd/yyyy format to yyyy-dd-mm?

    Answer:

    We use

a) "Iconv" function - Internal Conversion.

b) "Oconv" function - External Conversion.

The function to convert mm/dd/yyyy format to yyyy-dd-mm is:

Oconv(Iconv(FieldName, "D/MDY[2,2,4]"), "D-YDM[4,2,2]")
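The same two-step idea — parse the external string into an internal value, then format it back out — looks like this in Python. This is a rough analogue of the Iconv/Oconv pair, not their implementation:

```python
from datetime import datetime

def convert_date(value):
    # "Iconv" step: parse mm/dd/yyyy into an internal date value.
    internal = datetime.strptime(value, "%m/%d/%Y")
    # "Oconv" step: format the internal value as yyyy-dd-mm,
    # the target format named in the question.
    return internal.strftime("%Y-%d-%m")

print(convert_date("01/31/2005"))  # → 2005-31-01
```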

    Question: Types of Parallel Processing?

    Answer:

    Parallel Processing is broadly classified into 2 types.

a) SMP - Symmetric Multi-Processing.

b) MPP - Massively Parallel Processing.

    Question: What does a Config File in parallel extender consist of?

    Answer:

    Config file consists of the following.

    a) Number of Processes or Nodes.

    b) Actual Disk Storage Location.

    Question: Functionality of Link Partitioner and Link Collector?

    Answer:

    Link Partitioner: It actually splits data into various partitions or data flows using various

    Partition methods.

    Link Collector: It collects the data coming from partitions, merges it into a single data flow and loads to

    target.
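The split-then-collect flow can be sketched in Python. Round-robin is just one of the partition methods the stage offers; this toy version ignores parallel execution and only shows the data movement:

```python
from itertools import chain

# Link Partitioner analogue: split one flow into n partitions
# using a round-robin method.
def partition_round_robin(rows, n):
    flows = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        flows[i % n].append(row)
    return flows

# Link Collector analogue: merge the partitions back into a
# single data flow for the target.
def collect(flows):
    return list(chain.from_iterable(flows))

flows = partition_round_robin(list(range(10)), 3)
merged = collect(flows)
```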


Question: What is Modulus and Splitting in a Dynamic Hashed File?

Answer:

In a Hashed File, the size of the file keeps changing randomly.

If the size of the file increases it is called "Modulus".

If the size of the file decreases it is called "Splitting".

Question: Types of views in DataStage Director?

Answer:

There are 3 types of views in DataStage Director:

a) Job View - Dates of jobs compiled and run.

b) Status View - Status of the job's last run.

c) Log View - Warning messages, event messages, program-generated messages.

    Question: Did you Parameterize the job or hard-coded the values in the jobs?

    Answer:

Always parameterized the job. Either the values come from Job Properties or from Parameter

Manager, a third-party tool. There is no way you should hard-code parameters in your jobs. The most often

parameterized variables in a job are: DB DSN name, username, password, and dates w.r.t. the data to be

looked up against.

    Question: Have you ever involved in updating the DS versions like DS 5.X, if so tell us some the

    steps you have taken in doing so?

    Answer:

    Yes.

    The following are some of the steps:

    Definitely take a back up of the whole project(s) by exporting the project as a .dsx file

Use the same parent folder for the new version as well, so that your old jobs using hard-

coded file paths continue to work.

    After installing the new version import the old project(s) and you have to compile them all again. You can

    use 'Compile All' tool for this.

    Make sure that all your DB DSN's are created with the same name as old ones. This step is for moving DS

    from one machine to another.

    In case if you are just upgrading your DB from Oracle 8i to Oracle 9i there is tool on DS CD that can do

    this for you.

Do not stop the 6.0 server before the upgrade; the version 7.0 install process collects project information

during the upgrade. There is NO rework (recompilation of existing jobs/routines) needed after the

upgrade.


Question: How did you handle reject data?

Answer:

Typically a Reject link is defined and the rejected data is loaded back into the data warehouse. So a Reject link

has to be defined for every Output link where you wish to collect rejected data. Rejected data is typically bad data

like duplicates of primary keys or null rows where data is expected.

    Question: What are other Performance tunings you have done in your last project to increase the

    performance of slowly running jobs?

    Answer:

Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the server using

Hash/Sequential files for optimum performance, and also for data recovery in case the job aborts.

Tuned the OCI stage 'Array Size' and 'Rows per Transaction' numerical values for faster inserts,

updates and selects. Tuned the 'Project Tunables' in Administrator for better performance.

    Used sorted data for Aggregator.

Sorted the data as much as possible in the DB and reduced the use of DS-Sort for better performance of

jobs.

    Removed the data not used from the source as early as possible in the job.

    Worked with DB-admin to create appropriate Indexes on tables for better performance of DS queries.

Converted some of the complex joins/business logic in DS to Stored Procedures on the DB for faster execution

of the jobs.

    If an input file has an excessive number of rows and can be split-up then use standard logic to run jobs

    in parallel.

Before writing a routine or a transform, make sure that the functionality required is not already present in one of

the standard routines supplied in the sdk or ds utilities categories.

Constraints are generally CPU intensive and take a significant amount of time to process. This may be

the case if the constraint calls routines or external macros, but if it is inline code then the overhead will

be minimal.

Try to have the constraints in the 'Selection' criteria of the jobs itself. This will eliminate the

unnecessary records from even getting in before joins are made.

    Tuning should occur on a job-by-job basis.

    Use the power of DBMS.

    Try not to use a sort stage when you can use an ORDER BY clause in the database.

    Using a constraint to filter a record set is much slower than performing a SELECT WHERE.

Make every attempt to use the bulk loader for your particular database; bulk loaders are generally faster than ODBC or OLE drivers.


    Question: Tell me one situation from your last project, where you had faced problem and How did u

    solve it?

    Answer:

1. The jobs in which data is read directly from OCI stages were running extremely slow. I had to stage the

data before sending it to the transformer to make the jobs run faster.

2. The job aborts in the middle of loading some 500,000 rows. You have the option of either cleaning/deleting

the loaded data and then running the fixed job, or running the job again from the row at which the job aborted. To

make sure the load was proper we opted for the former.

    Question: Tell me the environment in your last projects

    Answer:

    Give the OS of the Server and the OS of the Client of your recent most project

    Question: How did u connect with DB2 in your last project?

    Answer:

Most of the time the data was sent to us in the form of flat files. The data is dumped and sent to us. In

some cases where we needed to connect to DB2 for look-ups, we used ODBC drivers to

connect to DB2 (or) DB2-UDB depending on the situation and availability. Certainly DB2-UDB is better in

terms of performance, as you know native drivers are always better than ODBC drivers. 'iSeries Access

ODBC Driver 9.00.02.02' - ODBC drivers to connect to AS400/DB2.

    Question: What are Routines and where/how are they written and have you written any routine

    before?

    Answer:

Routines are stored in the Routines branch of the DataStage Repository, where you can create, view or

edit them.

    The following are different types of Routines:

    1. Transform Functions

    2. Before-After Job subroutines

    3. Job Control Routines

    Question: How did you handle an 'Aborted' sequencer?

    Answer:

    In almost all cases we have to delete the data inserted by this from DB manually and fix the job and then

    run the job again.


    Question: Read the String functions in DS

    Answer:

    Functions like [] -> sub-string function and ':' -> concatenation operator

    Syntax:

    string [ [ start, ] length ]

    string [ delimiter, instance, repeats ]
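The `string[start, length]` form (1-based start) and the `':'` concatenation operator can be emulated in Python — a small sketch of the semantics, not the BASIC runtime:

```python
# Emulate the BASIC sub-string operator string[start, length],
# where start is 1-based, by shifting to Python's 0-based slices.
def substr(s, start, length):
    return s[start - 1:start - 1 + length]

full = "DataStage"
prefix = substr(full, 1, 4)            # "Data"
# The BASIC ':' concatenation operator maps to '+' in Python.
combined = substr(full, 5, 5) + "Server"
```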

Question: What will you do in a situation where somebody wants to send you a file and use that file as

an input or reference, and then run the job?

    Answer:

    Under Windows : Use the 'WaitForFileActivity' under the Sequencers and then run the job. May be you

    can schedule the sequencer around the time the file is expected to arrive.

Under UNIX: Poll for the file. Once the file has arrived, start the job or sequencer depending on the file.
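The UNIX polling loop can be sketched like this (a minimal example; the path is a made-up placeholder, and the job launch itself is left as a comment):

```python
import os
import time

# Wait until the file appears or the timeout expires, checking
# every `interval` seconds.
def wait_for_file(path, timeout=60, interval=1.0):
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    return False

# if wait_for_file("/data/incoming/feed.dat"):   # hypothetical path
#     ... run the dsjob command here ...
```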

    Question: What is the utility you use to schedule the jobs on a UNIX server other than using

    Ascential Director?

    Answer:

    Use crontab utility along with dsexecute() function along with proper parameters passed.

    Question: Did you work in UNIX environment?

    Answer:

    Yes. One of the most important requirements.

Question: How would you call an external Java function which is not supported by DataStage?

    Answer:

Starting from DS 6.0 we have the ability to call external Java functions using a Java package from

Ascential. In this case we can even use the command line to invoke the Java function, write the return

values from the Java program (if any) to a file, and use that file as a source in the DataStage job.

    Question: How will you determine the sequence of jobs to load into data warehouse?

    Answer:

    First we execute the jobs that load the data into Dimension tables, then Fact tables, then load the

    Aggregator tables (if any).

Question: The above is the right sequence. Why do we have to load the dimensional tables first, then the fact tables?


    As we load the dimensional tables the keys (primary) are generated and these keys (primary) are Foreign

    keys in Fact tables.
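The dependency described above can be shown with a toy sketch: dimension rows are assigned surrogate keys first, and fact rows then look those keys up as foreign keys (the table contents are invented for illustration):

```python
# Load the dimension first: each member gets a surrogate key.
dim_rows = ["Books", "Music"]
dim_table = {name: key for key, name in enumerate(dim_rows, start=1)}

# Load the fact second: resolve each source row's natural key
# into the dimension's surrogate key (the foreign key).
fact_source = [{"product": "Books", "amount": 10},
               {"product": "Music", "amount": 25}]
fact_table = [{"product_key": dim_table[r["product"]], "amount": r["amount"]}
              for r in fact_source]
```

Loading the fact first would fail here: the `dim_table[...]` lookup would have no key to return.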

Question: Does the selection of 'Clear the table and Insert rows' in the ODBC stage send a Truncate

statement to the DB, or does it do some kind of Delete logic?

Answer:

There is no TRUNCATE on ODBC stages. It is Clear table blah blah, and that is a "delete from" statement.

On an OCI stage such as Oracle, you do have both Clear and Truncate options. They are radically different

in permissions (Truncate requires you to have alter table permissions where Delete doesn't).

    Question: How do you rename all of the jobs to support your new File-naming conventions?

    Answer:

Create an Excel spreadsheet with new and old names. Export the whole project as a dsx. Write a Perl

program, which can do a simple rename of the strings looking up the Excel file. Then import the new dsx

file, probably into a new project for testing. Recompile all jobs. Be cautious that the name of the jobs has

also been changed in your job control jobs or Sequencer jobs, so you have to make the necessary changes

to these Sequencers.

    Question: When should we use ODS?

    Answer:

DWHs are typically read-only, batch updated on a schedule.

ODSs are maintained in more real time, trickle-fed constantly.

    Question: What other ETL's you have worked with?

    Answer:

    Informatica and also DataJunction if it is present in your Resume.

    Question: How good are you with your PL/SQL?

    Answer:

    On the scale of 1-10 say 8.5-9

    Question: What versions of DS you worked with?

    Answer:

    DS 7.5, DS 7.0.2, DS 6.0, DS 5.2


Question: What is the difference between a DataStage developer and a DataStage designer?

Answer:

A DataStage developer is one who will code the jobs. A DataStage designer is one who will design the job; I mean

he will deal with blueprints and he will design the jobs and the stages that are required in developing the code.

    Question: What are the requirements for your ETL tool?

Answer:

Do you have large sequential files (1 million rows, for example) that need to be compared every day

versus yesterday?

If so, then ask how each vendor would do that. Think about what process they are going to do. Are they

requiring you to load yesterday's file into a table and do lookups?

    If so, RUN!! Are they doing a match/merge routine that knows how to process this in sequential files?

    Then maybe they are the right one. It all depends on what you need the ETL to do.

    If you are small enough in your data sets, then either would probably be OK.

    Question: What are the main differences between Ascential DataStage and Informatica

    PowerCenter?

    Answer:

Chuck Kelley's Answer: You are right; they have pretty much similar functionality. However, what are

the requirements for your ETL tool? Do you have large sequential files (1 million rows, for example) that

need to be compared every day versus yesterday? If so, then ask how each vendor would do that. Think

about what process they are going to do. Are they requiring you to load yesterday's file into a table and do

lookups? If so, RUN!! Are they doing a match/merge routine that knows how to process this in sequential

files? Then maybe they are the right one. It all depends on what you need the ETL to do. If you are small

enough in your data sets, then either would probably be OK.

Les Barbusinski's Answer: Without getting into specifics, here are some differences you may want to

explore with each vendor:

Does the tool use a relational or a proprietary database to store its metadata and scripts? If

proprietary, why?

What add-ons are available for extracting data from industry-standard ERP, Accounting, and CRM

packages?

Can the tool's metadata be integrated with third-party data modeling and/or business intelligence

tools? If so, how and with which ones?

How well does each tool handle complex transformations, and how much external scripting is

required?


Almost any ETL tool will look like any other on the surface. The trick is to find out which one will work

best in your environment. The best way I've found to make this determination is to ascertain how

successful each vendor's clients have been using their product, especially clients who closely resemble

your shop in terms of size, industry, in-house skill sets, platforms, source systems, data volumes and

transformation complexity.

Ask both vendors for a list of their customers with characteristics similar to your own that have used their

ETL product for at least a year. Then interview each client (preferably several people at each site) with an

eye toward identifying unexpected problems, benefits, or quirkiness with the tool that have been

encountered by that customer. Ultimately, ask each customer, if they had it all to do over again, whether

or not they'd choose the same tool and why. You might be surprised at some of the answers.

Joyce Bischoff's Answer: You should do a careful research job when selecting products. You should first

document your requirements, identify all possible products and evaluate each product against the detailed

requirements. There are numerous ETL products on the market and it seems that you are looking at only

two of them. If you are unfamiliar with the many products available, you may refer to www.tdan.com, the

Data Administration Newsletter, for product lists.

If you ask the vendors, they will certainly be able to tell you which of their product's features are stronger

than the other product's. Ask both vendors and compare the answers, which may or may not be totally

accurate. After you are very familiar with the products, call their references and be sure to talk with

technical people who are actually using the product. You will not want the vendor to have a representative

present when you speak with someone at the reference site. It is also not a good idea to depend upon a

high-level manager at the reference site for a reliable opinion of the product. Managers may paint a very

rosy picture of any selected product so that they do not look like they selected an inferior product.

Question: In how many places can you call Routines?

Answer:

You can call routines in four places:

    1. Transform of routine

    a. Date Transformation

    b. Upstring Transformation

    2. Transform of the Before & After Subroutines

    3. XML transformation

    4. Web base transformation


Job control programs are generated depending on your job's nature, either a simple job or a sequencer job; you can see this program

under the job control option.

Question: Suppose 4 jobs are controlled by a sequencer (job 1, job 2, job 3, job 4). If job 1 has

10,000 rows and after running the job only 5,000 rows have been loaded in the target table, the remaining are not loaded and your job is going to be aborted. How can you sort out the problem?

    Answer:

Suppose the job sequencer synchronizes or controls 4 jobs, but job 1 has a problem. In this condition you should go to the

Director and check what type of problem it is showing: either a data type problem, warning message, job fail or

job aborted. If the job fails it means a data type problem or a missing column action. So you should go to the Run window

-> Click -> Tracing -> Performance, or in your target table -> General -> Action -> select this option. Here there are two

options:

(i) On Fail -- Commit, Continue

(ii) On Skip -- Commit, Continue.

First check how much data is already loaded, then select the On Skip option and continue; for the

remaining data not yet loaded, select On Fail, Continue. Run the job again and you will definitely

get a success message.

Question: What happens if RCP is disabled?

Answer:

In such a case OSH has to perform import and export every time the job runs, and the processing time

of the job is also increased.




    Question: What are Sequencers?

Answer: Sequencers are job control programs that execute other jobs with preset job parameters.


Question: What is the difference between the Filter stage and the Switch stage?

Ans: There are two main differences, and probably some minor ones as well. The two main differences

are as follows.

1) The Filter stage can send one input row to more than one output link. The Switch stage cannot - the

C switch construct has an implicit break in every case.

2) The Switch stage is limited to 128 output links; the Filter stage can have a theoretically unlimited

number of output links. (Note: this is not a challenge!)

Question: How can I achieve constraint-based loading using DataStage 7.5? My target tables have inter-

dependencies, i.e. primary key / foreign key constraints. I want my primary key tables to be loaded first and

then my foreign key tables, and also primary key tables should be committed before the foreign key tables

are executed. How can I go about it?

Ans: 1) Create a Job Sequencer to load your tables in sequential mode.

In the sequencer, call all primary key table loading jobs first, followed by foreign key tables; when

triggering the foreign key table load jobs, trigger them only when the primary key load jobs run successfully

(i.e. OK trigger).

2) To improve the performance of the job, you can disable all the constraints on the tables and load them.

Once loading is done, check the integrity of the data; raise exceptions for data that does not meet it and

cleanse that data.

This is only a suggestion; normally, when loading while constraints are up, performance will drastically go

down.

3) If you use Star schema modeling, when you create the physical DB from the model, you can delete all

constraints, and the referential integrity would be maintained in the ETL process by referring to all your dimension keys while loading fact tables: once all dimension keys are assigned to a fact, the dimension and fact remain consistent.


    Question: How do you merge two files in DS?

Ans: Either use the Copy command as a Before-job subroutine if the metadata of the 2 files is the same, or

create a job to concatenate the 2 files into one if the metadata is different.

    Question: How do you eliminate duplicate rows?

Ans: DataStage provides us with a Remove Duplicates stage in Enterprise Edition. Using that stage we

can eliminate the duplicates based on a key column.
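The key-based de-duplication idea can be sketched in Python — a toy version of the "retain first duplicate" behaviour, not the stage itself:

```python
# Keep the first row seen for each key value; later rows with
# the same key are treated as duplicates and dropped.
def remove_duplicates(rows, key):
    seen = set()
    out = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

rows = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}, {"id": 2, "v": "c"}]
deduped = remove_duplicates(rows, "id")
```

Note that, like the stage, this assumes you have chosen which duplicate to retain; here it is always the first one encountered.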

    Question: How do you pass filename as the parameter for a job?

Ans: During job development we can create a parameter 'FILE_NAME', and the value can be passed while running the job.


    Question:Is there a mechanism available to export/import individual DataStage ETL jobs from the

    UNIX command line?

Ans: Try dscmdexport and dscmdimport. They won't handle the "individual job" requirement - you can only export full projects from the

command line.

You can find the export and import executables on the client machine, usually someplace like: C:\Program Files\Ascential\DataStage.

    Question: Diff. between JOIN stage and MERGE stage.

    Answer:

JOIN: Performs join operations on two or more data sets input to the stage and then outputs the resulting

dataset.

MERGE: Combines a sorted master data set with one or more sorted update data sets. The columns

from the records in the master and update data sets are merged so that the output record contains all the

columns from the master record plus any additional columns from each update record that are required.

A master record and an update record are merged only if both of them have the same values for the merge

key column(s) that we specify. Merge key columns are one or more columns that exist in both the master

and update records.


    Question: Advantages of the DataStage?

    Answer:

    Business advantages:

    Helps for better business decisions;

    It is able to integrate data coming from all parts of the company;

    It helps to understand the new and already existing clients;

We can collect data of different clients with it, and compare them;

    It makes the research of new business possibilities possible;

We can analyze trends of the data read by it.

    Technological advantages:

    It handles all company data and adapts to the needs;

    It offers the possibility for the organization of a complex business intelligence;

Flexible and scalable;

    It accelerates the running of the project;

    Easily implementable.

    DATASTAGE FAQ

    1. What is the architecture of data stage?

    Basically architecture of DS is client/server architecture.

    Client components & server components

    Client components are 4 types they are

    1. Data stage designer

    2. Data stage administrator

    3. Data stage director

    4. Data stage manager


    Data stage manager is used for to import & export the project to view & edit the contents of the

    repository.

Data stage administrator is used for creating the project, deleting the project & setting the environment

variables.

Data stage director is used for running the jobs, validating the jobs, and scheduling the jobs.

    Server components

    DS server: runs executable server jobs, under the control of the DS director, that extract, transform, and

    load data into a DWH.

    DS Package installer: A user interface used to install packaged DS jobs and plug-in;

Repository or project: a central store that contains all the information required to build a DWH or data

mart.

    2. What r the stages u worked on?

    3. I have some jobs every month automatically delete the log details what r the steps u have to take

    for that

We have to set the auto-purge option in DS Administrator.

4. I want to run multiple jobs within a single job. How can you handle this?

In the job properties set the option ALLOW MULTIPLE INSTANCES.

    5. What is version controlling in DS?

    In DS, version controlling is used for back up the project or jobs.

    This option is available in DS 7.1 version onwards.

Version controls are of 2 types:

1. VSS - Visual Source Safe

2. CVSS - Concurrent Visual Source Safe


VSS is designed by Microsoft, but the disadvantage is that only one user can access it at a time; other users must

wait until the first user completes the operation.

CVSS: by using this, many users can access it concurrently. When compared to VSS, the cost of CVSS is high.

    6. What is the difference between clear log file and clear status file?

    Clear log--- we can clear the log details by using the DS Director. Under job menu clear log option is

    available. By using this option we can clear the log details of particular job.

    Clear status file---- lets the user remove the status of the record associated with all stages of selected jobs

    (in DS Director)

    7. I developed 1 job with 50 stages, at the run time one stage is missed how can u identify which

    stage is missing?

By using the usage analysis tool, which is available in DS Manager, we can find out what items are

used in the job.

8. My job takes 30 minutes to run. I want to run the job in less than 30 minutes. What are the steps

we have to take?

    By using performance tuning aspects which are available in DS, we can reduce time.

    Tuning aspect

    In DS administrator : in-process and inter process

    In between passive stages : inter process stage

    OCI stage : Array size and transaction size

    And also use link partitioner & link collector stage in between passive stages

9. How to do row transposition in DS?

The Pivot stage is used for transposition purposes. Pivot is an active stage that maps sets of columns in an input

table to a single column in an output table.
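The mapping the Pivot stage performs can be sketched in Python — a toy version in which a set of quarterly columns in each input row becomes one output row per column (the column names are made up):

```python
# Pivot sketch: turn the columns named in pivot_cols into rows,
# carrying the fixed column along with each output row.
def pivot(rows, fixed, pivot_cols):
    out = []
    for row in rows:
        for col in pivot_cols:
            out.append({fixed: row[fixed], "column": col, "value": row[col]})
    return out

rows = [{"product": "P1", "q1": 10, "q2": 20}]
pivoted = pivot(rows, "product", ["q1", "q2"])
```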


    10. If a job locked by some user, how can you unlock the particular job in DS?

We can unlock the job by using the clean-up resources option, which is available in DS Director. Otherwise

we can find the PID (process id) and kill the process on the UNIX server.

    11. What is a container? How many types containers are available? Is it possible to use container a

    look up?

A container is a group of stages and links. Containers enable you to simplify and modularize your server

job designs by replacing complex areas of the diagram with a single container stage.

DataStage provides two types of container:

Local containers. These are created within a job and are only accessible by that job.

Shared containers. These are created separately and are stored in the Repository in the same way that

jobs are. Shared containers can be used in any job in the project.

Yes, we can use a container as a look-up.

    12. How to deconstruct the shared container?

To deconstruct the shared container, first you have to convert the shared container to a local container, and

then deconstruct the container.

13. I am getting an input value like X = Iconv("31 DEC 1967", "D"). What is the X value?

The X value is zero.

The Iconv function converts a string to an internal storage format. It takes 31 DEC 1967 as zero and counts days

from that date (31-dec-1967).
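The day-count behaviour can be reproduced in Python — a sketch of the epoch arithmetic, not the Iconv implementation:

```python
from datetime import date

# Internal date format: days counted from the epoch 31 DEC 1967,
# which itself converts to 0.
EPOCH = date(1967, 12, 31)

def iconv_date(d):
    return (d - EPOCH).days

print(iconv_date(date(1967, 12, 31)))  # → 0
print(iconv_date(date(1968, 1, 1)))    # → 1
```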

    14. What is the Unit testing, integration testing and system testing?

Unit testing: As for DS, a unit test will check data type mismatches,

the size of the particular data type, and column mismatches.

Integration testing: According to the dependencies we integrate all the jobs into one sequence. That

is called a control sequence.


    15. What are the command line functions that import and export the DS jobs?

    Dsimport.exe ---- To import the DataStage components

    Dsexport.exe ---- To export the DataStage components

    16. How many hashing algorithms are available for static hash file and dynamic hash file?

    Sixteen hashing algorithms for static hash file.

Two hashing algorithms for dynamic hash file (GENERAL or SEQ.NUM).

    17. What happens when you have a job that links two passive stages together?

Obviously there is some process going on. Under the covers DS inserts a cut-down transformer stage between

the passive stages, which just passes data straight from one stage to the other.

18. What is the use of the Nested Condition activity?

    Nested Condition. Allows you to further branch the execution of a sequence depending on a condition.

19. I have three jobs A, B, C, which are dependent on each other. I want to run the A & C jobs daily

and the B job only on Sunday. How can you do it?

First you have to schedule the A & C jobs Monday to Saturday in one sequence.

Next take the three jobs according to dependency in one more sequence and schedule that job only on Sunday.

    TOP 10 FEATURES IN DATASTAGE HAWK

    The IILive2005 conference marked the first public presentations of the functionality in the WebSphere

    Information Integration Hawk release. Though it's still a few months away, I am sharing the top ten things

    I am looking forward to in DataStage Hawk:

    1) The metadata server. To borrow a simile from that judge on American Idol: using MetaStage is kind of

    like bathing in the sea in winter. You know it's good for you, but that doesn't stop it being cold. You had to install the extra product, move the

    metadata, and learn how the product works and write reports. Hawk brings the common repository and

    improved metadata reporting, and we can get the positive effects of bathing in sea water without the

    shrinkage that comes with it.

    2) QualityStage overhaul. Data Quality reporting can be another forgotten aspect of data integration

    projects. Like MetaStage the QualityStage server and client had an additional install, training and

    implementation overhead so many DataStage projects did not use it. I am looking forward to more

    integration projects using standardisation, matching and survivorship to improve quality once these

    features are more accessible and easier to use.

    3) Frictionless Connectivity and Connection Objects. I've called DB2 every rude name under the sun. Not

    because it's a bad database but because setting up remote access takes me anywhere from five minutes to

    five weeks depending on how obscure the error message and how hard it is to find the obscure setup step

    that was missed during installation. Anything that makes connecting to a database easier gets a big tick from

    me.

    4) Parallel job range lookup. I am looking forward to this one because it will stop people asking for it on

    forums. It looks good, it's been merged into the existing lookup form and seems easy to use. Will be

    interested to see the performance.
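    A range lookup matches an input value against a low/high band in the reference data rather than an exact key. A rough sketch of the idea in Python, using hypothetical salary-grade bands (an illustration of the concept, not DataStage's actual implementation):

```python
import bisect

# Hypothetical reference bands: (low, high, grade), sorted by the low bound.
bands = [(0, 1000, 1), (1000, 3000, 2), (3000, 10000, 3)]
lows = [low for low, _, _ in bands]

def range_lookup(value):
    """Return the grade whose [low, high) band contains value, else None."""
    i = bisect.bisect_right(lows, value) - 1
    if i >= 0 and bands[i][0] <= value < bands[i][1]:
        return bands[i][2]
    return None

print(range_lookup(2500))   # 2
print(range_lookup(99999))  # None
```

    Sorting the bands and binary-searching the low bounds is what makes a range lookup cheap compared with scanning every band per row.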

    5) Slowly Changing Dimension Stage. This is one of those things that Informatica were able to trumpet at

    product comparisons: that they have more out-of-the-box DW support. There are a few enhancements to

    make updates to dimension tables easier, there is the improved surrogate key generator, there is the slowly

    changing dimension stage and updates passed to in memory lookups. That's it for me with DBMS

    generated keys, I'm only doing the keys in the ETL job from now on! DataStage server jobs have the hash

    file lookup where you can read and write to it at the same time, parallel jobs will have the updateable

    lookup.

    6) Collaboration: better developer collaboration. Everyone hates opening a job and being told it is locked.

    "Bloody whatshisname has gone to lunch, locked the job and now his password-protected screen saver is

    up! Unplug his PC!" Under Hawk you can open a read-only copy of a locked job, plus you get told who has

    locked the job so you know whom to curse.

    7) Session Disconnection. Accompanied by the metallic cry of "exterminate! exterminate!", an

    administrator can disconnect sessions and unlock jobs.

    8) Improved SQL Builder. I know a lot of people cross the street when they see the SQL Builder coming,

    though it can at least build the column list to avoid column mismatches. I am hoping the next version is more flexible and can build complex

    SQL.

    9) Improved job startup times. Small parallel jobs will run faster. I call it the death of a thousand cuts: your

    very large parallel job takes too long to run because a thousand smaller jobs are starting and stopping at

    the same time and cutting into CPU and memory. Hawk makes these cuts less painful.

    10) Common logging. Log views that work across jobs, log searches, log date constraints, wildcard

    message filters, saved queries. It's all good. You no longer need to send out a search party to find an error

    message.

    That's my top ten. I am also hoping the software comes in a box shaped like a hawk and makes a hawk

    scream when you open it. A bit like those annoying greeting cards. Is there any functionality you think

    Hawk is missing that you really want to see?

    1.Display the dept information from department table

    select * from dept;

    2.Display the details of all employees

    select * from emp;

    3.Display the name and job for all employees

    select ename,job from emp;

    4.Display name and salary for all employees

    select ename,sal from emp;

    5.Display employee number and total salary for each employee

    select empno,sal+nvl(comm,0) from emp;

    6.Display employee name and annual salary for all employees

    select ename,12*sal+nvl(comm,0) annualsal from emp;
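    Query 6 wraps comm in nvl(comm,0) because NULL + number yields NULL in SQL. The effect is easy to demonstrate with Python's built-in sqlite3, where IFNULL plays the role of Oracle's NVL (the table and rows below are toy data for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal int, comm int)")
conn.execute("insert into emp values ('SMITH', 800, null), ('ALLEN', 1600, 300)")

# Without IFNULL the NULL commission poisons the sum; with it we get a number.
rows = conn.execute(
    "select ename, sal + comm, sal + ifnull(comm, 0) from emp"
).fetchall()
print(rows)  # [('SMITH', None, 800), ('ALLEN', 1900, 1900)]
```

    Any arithmetic or comparison involving NULL is NULL/unknown, which is why totals over nullable columns should always go through NVL, IFNULL or COALESCE.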

    7.Display the names of all employees who are working in department number 10

    select ename from emp where deptno = 10;

    8.Display the names of all employees working as clerks and drawing a salary more than 3000

    select ename from emp where job = 'CLERK' and sal > 3000;

    9.Display employee number and names for employees who earn commission

    select empno,ename from emp where comm is not null and comm > 0;

    10.Display names of employees who do not earn any commission

    select empno,ename from emp where comm is null or comm = 0;

    11.Display the names of employees who are working as clerk , salesman or analyst and drawing a salary more

    than 3000

    select ename from emp where (job='CLERK' or job='SALESMAN' or job='ANALYST') and sal>3000;

    12.Display the names of employees who are working in the company for the past 5 years

    select ename from emp where sysdate - hiredate > 5*365;

    13.Display the list of employees who have joined the company before 30th June 90 or after 31st Dec 90

    select * from emp where hiredate < '30-jun-1990' or hiredate > '31-dec-1990';

    14.Display current date

    select sysdate from dual;

    15.Display the list of users in your database (using log table)

    select * from dba_users;

    16.Display the names of all tables from the current user

    select * from tab;

    17.Display the name of the current user

    show user;

    18.Display the names of employees working in department number 10 or 20 or 40 or employees working as

    clerks , salesman or analyst

    select ename from emp where deptno in (10,20,40) or job in ('CLERK','SALESMAN','ANALYST');

    19.Display the names of employees whose name starts with alphabet S

    select ename from emp where ename like 'S%';

    20.Display employee name from employees whose name ends with alphabet S

    select ename from emp where ename like '%S';

    21.Display the names of employees whose names have second alphabet A in their names

    select ename from emp where ename like '_A%';

    22.Display the names of employees whose name is exactly five characters in length

    select ename from emp where length(ename)=5;

    or

    select ename from emp where ename like '_____';

    23.Display the names of employees who are not working as managers

    select * from emp minus (select * from emp where empno in (select mgr from emp));

    or

    select * from emp where empno not in (select mgr from emp where mgr is not null);

    or

    select * from emp e where empno not in (select mgr from emp where e.empno=mgr);
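    The `mgr is not null` filter in the alternatives above matters: if the NOT IN list contains a NULL, the predicate can never be true and the query returns no rows at all. A quick demonstration with Python's sqlite3 on toy data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (empno int, ename text, mgr int)")
conn.execute(
    "insert into emp values (7839,'KING',null), (7566,'JONES',7839), (7369,'SMITH',7566)"
)

# KING's NULL mgr makes "empno NOT IN (...)" unknown for every row -> empty result.
bad = conn.execute(
    "select ename from emp where empno not in (select mgr from emp)"
).fetchall()
good = conn.execute(
    "select ename from emp where empno not in"
    " (select mgr from emp where mgr is not null)"
).fetchall()
print(bad)   # []
print(good)  # [('SMITH',)]
```

    NOT IN expands to a chain of `<> each value` conditions, and `empno <> NULL` is unknown, so one NULL in the list silently empties the result.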

    24.Display the names of employees who are not working as SALESMAN or CLERK or ANALYST

    select ename from emp where job not in ('CLERK','ANALYST','SALESMAN');

    25.Display all rows from emp table. The system should wait after every screen full of information

    set pause on;

    26.Display the total number of employees working in the company

    select count(*) from emp;

    27.Display the total salary and total commission to all employees

    select sum(sal), sum(nvl(comm,0)) from emp;

    28.Display the maximum salary from emp table

    select max(sal) from emp;

    29.Display the minimum salary from emp table

    select min(sal) from emp;

    30.Display the average salary from emp table

    select avg(sal) from emp;

    31.Display the maximum salary being paid to CLERK

    select max(sal) from emp where job='CLERK';

    32.Display the maximum salary being paid in dept no 20

    select max(sal) from emp where deptno=20;

    33.Display the minimum salary being paid to any SALESMAN

    select min(sal) from emp where job='SALESMAN';

    34.Display the average salary drawn by managers

    select avg(sal) from emp where job='MANAGER';

    35.Display the total salary drawn by analysts working in dept no 40

    select sum(sal)+sum(nvl(comm,0)) from emp where deptno=40 and job='ANALYST';

    36.Display the names of employees in order of salary i.e. the name of the employee earning lowest salary

    should appear first

    select ename from emp order by sal;

    37.Display the names of employees in descending order of salary

    select ename from emp order by sal desc;

    38.Display the details from emp table in order of emp name

    select * from emp order by ename;

    39.Display empno,ename,deptno and sal. Sort the output first based on name and within name by deptno and

    within deptno by sal

    select empno,ename,deptno,sal from emp order by ename,deptno,sal;

    40) Display the name of employees along with their annual salary(sal*12).

    the name of the employee earning highest annual salary should appear first?

    Ans:select ename,sal,sal*12 "Annual Salary" from emp order by "Annual Salary" desc;

    41) Display name, salary, HRA, DA, PF and total salary for each employee.

    The output should be in the order of total salary. HRA is 15% of salary, DA is 10% of salary, PF is 5% of salary; total

    salary will be (salary+HRA+DA)-PF.

    Ans: select ename,sal SA,sal*0.15 HRA,sal*0.10 DA,sal*0.05 PF, sal+(sal*0.15)+(sal*0.10)-(sal*0.05)

    TOTALSALARY

    from emp ORDER BY TOTALSALARY DESC;

    42) Display Department numbers and total number of employees working in each Department?

    Ans: select deptno,count(*) from tvsemp group by deptno;

    43) Display the various jobs and total number of employees working in each job group?

    Ans: select job,count(*) from tvsemp group by job;

    44)Display department numbers and Total Salary for each Department?

    Ans: select deptno,sum(sal) from tvsemp group by deptno;

    45)Display department numbers and Maximum Salary from each Department?

    Ans: select deptno,max(Sal) from tvsemp group by deptno;

    46)Display various jobs and Total Salary for each job?

    Ans: select job,sum(sal) from tvsemp group by job;

    47)Display each job along with min of salary being paid in each job group?

    Ans: select job ,min(sal) from tvsemp group by job;

    48) Display the department Number with more than three employees in each department?

    Ans: select deptno ,count(*) from tvsemp group by deptno having count(*)>3;

    49) Display various jobs along with total salary for each of the job where total salary is greater than 40000?

    Ans: select job,sum(sal) from tvsemp group by job having sum(SAl)>40000;

    50) Display the various jobs along with total number of employees in each job.The

    output should contain only those jobs with more than three employees?

    Ans: select job,count(*) from tvsemp group by job having count(*)>3;

    51) Display the name of employees who earn Highest Salary?

    Ans: select ename, sal from tvsemp where sal>=(select max(sal) from tvsemp );

    52) Display the employee Number and name for employee working as clerk and earning highest salary among

    the clerks?

    Ans: select ename,empno from tvsemp where sal=(select max(sal) from tvsemp where job='CLERK') and

    job='CLERK' ;

    53) Display the names of salesman who earns a salary more than the Highest Salary of the clerk?

    Ans: select ename,sal from tvsemp where sal>(select max(sal) from tvsemp where job='CLERK') AND

    job='SALESMAN';

    54) Display the names of clerks who earn a salary more than the lowest Salary of any salesman?

    Ans: select ename,sal from tvsemp where sal>(select min(sal) from tvsemp where job='SALESMAN') and

    job='CLERK';

    55) Display the names of employees who earn a salary more than that of JONES or more than that

    of SCOTT?

    Ans: select ename,sal from tvsemp where sal>all(select sal from tvsemp where ename='JONES' OR

    ename='SCOTT');

    56) Display the names of employees who earn Highest salary in their respective departments?

    Ans: select ename,sal,deptno from tvsemp where sal in (select max(sal) from tvsemp group by deptno);

    57) Display the names of employees who earn Highest salaries in their respective job Groups?

    Ans: select ename,job from tvsemp where sal in (select max(sal) from tvsemp group by job);

    58) Display employee names who are working in Accounting department?

    Ans: select e.ename,d.dname from emp e,dept d where e.deptno=d.deptno and d.dname='ACCOUNTING';

    59) Display the employee names who are Working in Chicago?

    Ans: select e.ename,d.loc from emp e,tvsdept d where e.deptno=d.deptno and d.loc='CHICAGO';

    60) Display the job groups having Total Salary greater than the maximum salary for Managers?

    Ans: select job ,sum(sal) from tvsemp group by job having sum(sal) >(select max(sal) from tvsemp where

    job='MANAGER');

    61) Display the names of employees from department number 10 with salary greater than that of ANY

    employee working in other departments?

    Ans: select ename,deptno from tvsemp where sal>any(select min(sal) from tvsemp where deptno!=10 group by

    deptno) and deptno=10 ;

    62) Display the names of employees from department number 10 with salary greater than that of ALL

    employee working in other departments?

    Ans: select ename,deptno from tvsemp where sal>all(select max(sal) from tvsemp where deptno!=10 group by

    deptno) and deptno=10 ;

    63) Display the names of employees in Upper Case?

    Ans: select upper(ename) from tvsemp;

    64) Display the names of employees in Lower Case?

    Ans: select Lower(ename) from tvsemp;

    65) Display the names of employees in Proper case?

    Ans: select InitCap(ename)from tvsemp;

    66) Find the length of your name using the appropriate function?

    Ans: select length('RAMA') from dual;

    67) Display the length of all the employee names?

    Ans: select length(ename) from tvsemp;

    68) Display the name of employee concatenated with Employee Number?

    Ans: select ename||' '||empno from tvsemp;

    69) Use the appropriate function to extract 2 characters starting from the 3rd character of the following string

    'Oracle' i.e., the output should be ac?

    Ans: select substr('Oracle',3,2) from dual;

    70) Find the first occurrence of character a from the following string Computer Maintenance Corporation?

    Ans: select instr('Computer Maintenance Corporation','a') from dual;

    71) Replace every occurrence of alphabet A with B in the string 'Alliens' (Use Translate function)?

    Ans: select translate('Alliens','A','B') from Dual;

    72) Display the information from the employee table. Wherever the job MANAGER is found it should be displayed

    as BOSS?

    Ans: select ename ,replace(job,'MANAGER','BOSS') from tvsemp;

    73) Display empno,ename,deptno from tvsemp table. Instead of displaying department numbers

    display the related department name (Use decode function)?

    Ans: select empno,ename,deptno,Decode(deptno,10,'ACCOUNTING'

    ,20,'RESEARCH',30,'SALES','OPERATIONS')DName from tvsemp;
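    DECODE is Oracle-specific; the portable equivalent is a CASE expression. The same mapping as query 73, sketched with Python's sqlite3 on a few toy emp rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (empno int, ename text, deptno int)")
conn.execute(
    "insert into emp values (7369,'SMITH',20), (7499,'ALLEN',30), (7782,'CLARK',10)"
)

# CASE deptno WHEN ... mirrors DECODE(deptno, 10, 'ACCOUNTING', ...).
rows = conn.execute("""
    select ename,
           case deptno
               when 10 then 'ACCOUNTING'
               when 20 then 'RESEARCH'
               when 30 then 'SALES'
               else 'OPERATIONS'
           end as dname
    from emp order by empno
""").fetchall()
print(rows)  # [('SMITH', 'RESEARCH'), ('ALLEN', 'SALES'), ('CLARK', 'ACCOUNTING')]
```

    The trailing ELSE plays the same role as DECODE's final default argument ('OPERATIONS' here).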

    74) Display your Age in Days?

    Ans: select sysdate-to_date('30-jul-1977') from dual;

    75) Display your Age in Months?

    Ans: select months_between(sysdate,to_date('30-jul-1977')) from dual;

    76) Display current date as 15th August Friday Nineteen Ninety Seven?

    Ans: select To_char(sysdate,'ddth Month Day year') from dual;

    77-78) Display the following output for each row from tvsemp table: "Scott has joined the

    company on 13th August nineteen ninety"?

    Ans: select empno,ename,to_char(hiredate,'Day ddth Month year') from tvsemp;

    79) Find the nearest Saturday after Current date?

    Ans: select next_day(sysdate,'Saturday') from dual;

    80) Display the current time?

    Ans: select To_Char(sysdate,'HH:MI:SS') from dual;

    81) Display the date three months before the Current date?

    Ans: select Add_months(sysdate,-3) from dual

    82) Display the common jobs from department number 10 and 20?

    Ans: select job from tvsemp where job in (select job from tvsemp where deptno=20) and deptno=10;

    83) Display the jobs found in department 10 and 20 Eliminate duplicate jobs?

    Ans: select Distinct job from tvsemp where deptno in(10,20);

    84) Display the jobs which are unique to department 10?

    Ans: select job from tvsemp where deptno=10 and job not in (select job from tvsemp where deptno!=10);

    85) Display the details of those employees who do not have any person working under him?

    Ans: select empno,ename,job from tvsemp where empno not in (select mgr from tvsemp where mgr is not

    null );

    86) Display the details of those employees who are in sales department and grade is 3?

    Ans: select e.ename,d.dname,s.grade from emp e,dept d,salgrade s where e.deptno=d.deptno and e.sal

    between s.losal and s.hisal and d.dname='SALES' and s.grade=3;

    87) Display those who are not managers?

    Ans: select ename from tvsemp where job!='MANAGER';

    88) Display those employees whose name contains not less than 4 characters?

    Ans: select ename from tvsemp where length(ename)>=4

    89) Display those departments whose name starts with 'S' while the location name ends with 'K'?

    Ans: select d.dname,d.loc from tvsdept d where d.dname like 'S%' and d.loc like '%K';

    90) Display those employees whose manager name is Jones?

    Ans: select e.ename Superior,e1.ename Subordinate from tvsemp e, tvsemp e1 where e.empno=e1.mgr and

    e.ename='JONES';

    91) Display those employees whose salary is more than 3000 after giving 20% increment?

    Ans: select ename,sal,(sal+(sal*0.20)) from tvsemp where (sal+(sal*0.20))>3000;

    92) Display all employees with their department names?

    Ans: select e.ename,d.dname from tvsemp e, tvsdept d where e.deptno=d.deptno

    93) Display ename who are working in sales department?

    Ans: select e.ename,d.dname from emp e,dept d where e.deptno=d.deptno and d.dname='SALES';

    94) Display employee name,dept name,salary,and commission for those sal in between 2000

    to 5000 while location is Chicago?

    Ans: Select e.ename,d.dname,e.sal,e.comm from tvsemp e,dept d where e.deptno=d.deptno and e.sal between

    2000 and 5000 and d.loc='CHICAGO';

    95) Display those employees whose salary is greater than his managers salary?

    Ans: Select e.ename,e.sal,e1.ename,e1.sal from tvsemp e, tvsemp e1 where e.mgr=e1.empno and e.sal>e1.sal;

    96) Display those employees who are working in the same dept where his manager works?

    Ans: select e.ename,e.deptno,e1.ename,e1.deptno from tvsemp e, tvsemp e1 where e.mgr=e1.empno and

    e.deptno=e1.deptno;

    97) Display those employees who are not working under any Manager?

    Ans: select ename from tvsemp where mgr is null;

    98) Display the grade and employees name for the deptno 10 or 30 but grade is not 4 while

    joined the company before 31-DEC-82?

    Ans: select ename,grade,deptno,sal from tvsemp,salgrade where (grade,sal) in

    (select grade,sal from salgrade,tvsemp where sal between losal and hisal)

    and grade!=4 and deptno in (10,30) and hiredate < '31-DEC-82';

    99) Update the salary of each employee by 10% increment who are not eligible for commission?

    Ans: update tvsemp set sal= (sal+(sal*0.10)) where comm is null;

    100) Delete those employees who joined the company before 31-Dec-82 while their department Location is

    New York or Chicago?

    Ans: select e.ename,e.hiredate,d.loc from tvsemp e,tvsdept d where

    e.deptno=d.deptno and e.hiredate < '31-DEC-82' and d.loc in ('NEW YORK','CHICAGO');

    106) Display employee name, job and his manager. Also display employees who are without

    managers?

    Ans: select e.ename,e1.ename,e.job,e.sal,d.dname from tvsemp e, tvsemp e1, tvsdept d where e.mgr=e1.empno(+)

    and e.deptno=d.deptno;

    107) Display Top 5 employee of a Company?

    Ans: select * from (select * from tvsemp order by sal desc) where rownum <= 5;
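    Oracle's classic top-N idiom is an ordered inline view filtered by ROWNUM; SQLite and most other engines express the same with LIMIT. A sketch with Python's sqlite3 on toy salaries (ename is used as a tie-breaker only to keep the result deterministic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal int)")
conn.executemany(
    "insert into emp values (?, ?)",
    [('KING', 5000), ('SCOTT', 3000), ('FORD', 3000),
     ('JONES', 2975), ('BLAKE', 2850), ('CLARK', 2450), ('SMITH', 800)],
)

# Equivalent of: select * from (select * from emp order by sal desc) where rownum <= 5
top5 = [r[0] for r in conn.execute(
    "select ename from emp order by sal desc, ename limit 5"
)]
print(top5)  # ['KING', 'FORD', 'SCOTT', 'JONES', 'BLAKE']
```

    The key point in the Oracle form is that the ORDER BY must happen inside the inline view: filtering on ROWNUM before sorting would grab the first five arbitrary rows instead of the five highest salaries.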

    108) Display the names of those employees who are getting the highest salary?

    Ans: select ename,sal from tvsemp where sal in (select max(sal) from tvsemp)

    109) Display those employees whose salary is equal to average of maximum and minimum?

    Ans: select * from tvsemp

    where sal=(select (max(sal)+min(sal))/2 from tvsemp)

    110) Select count of employees in each department where count >3?

    Ans: select count(*) from tvsemp group by deptno having count(*)>3

    111) Display dname where atleast three are working and display only deptname?

    Ans: select d.dname from tvsdept d, tvsemp e where e.deptno=d.deptno group by d.dname having count(*)>=3;

    112) Display name of those managers name whose salary is more than average salary of

    Company?

    Ans: select distinct e1.ename,e1.sal from tvsemp e, tvsemp e1, dept d where e.deptno=d.deptno and

    e.mgr=e1.empno and e1.sal> (select avg(sal) from tvsemp);

    113) Display those managers name whose salary is more than the average salary of his

    employees?

    Ans: select distinct e1.ename,e1.sal from tvsemp e, tvsemp e1, dept d where e.deptno=d.deptno and

    e.mgr=e1.empno and e1.sal>any (select avg(sal) from tvsemp group by deptno);

    114) Display employee name,sal,comm and netpay for those employees whose netpay is

    greater than or equal to any other employee salary of the company?

    Ans: select ename,sal,NVL(comm,0),sal+NVL(comm,0) from tvsemp where

    sal+NVL(comm,0) >any (select e.sal from