October 16, 2018

Sreekanth B

SSIS Most Frequently Asked Interview Questions Answers

Explain What Is Ssis?

SQL Server Integration Services (SSIS) is a component of Microsoft SQL Server that can be used to accomplish a broad range of data migration tasks.

Explain What Is A Checkpoint In Ssis?

A checkpoint in SSIS allows a package to restart from the point of failure. The checkpoint file stores information about the package execution. If the package runs successfully, the checkpoint file is deleted; otherwise, the next run uses the file to restart from the point of failure.

Explain What Is Connection Managers In Ssis?

Connection managers are helpful when gathering data from different sources and writing it to a destination. A connection manager facilitates the connection to a system and holds information such as the data provider, server name, authentication mechanism, and database name.

Explain What Is Ssis Breakpoint?

A breakpoint enables you to pause the execution of a package in Business Intelligence Development Studio (BIDS) while developing or troubleshooting an SSIS package.

Explain What Is Event Logging In Ssis?

In SSIS, event logging allows you to select specific events of a task or a package to be logged. It is very helpful when you are troubleshooting a package or trying to understand its performance.

Explain What Is Logging Mode Property?

SSIS packages and all the associated tasks have a property called LoggingMode. This property accepts three possible values:

Disabled: To disable logging of the component

Enabled: To enable logging of the component

UseParentSetting: To use the logging setting of the component's parent
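
The way the three LoggingMode values interact can be sketched in a few lines of Python. This is only an illustration of the inheritance idea, not SSIS code: the Executable class is hypothetical, and treating a top-level UseParentSetting as "not logging" is an assumption made for this sketch.

```python
from typing import Optional


class Executable:
    """Hypothetical stand-in for a package, container, or task."""

    def __init__(self, name: str, logging_mode: str = "UseParentSetting",
                 parent: Optional["Executable"] = None):
        self.name = name
        self.logging_mode = logging_mode
        self.parent = parent

    def logging_enabled(self) -> bool:
        # Enabled and Disabled are decided by the component itself.
        if self.logging_mode == "Enabled":
            return True
        if self.logging_mode == "Disabled":
            return False
        # UseParentSetting: defer to the parent component.
        if self.parent is None:
            return False  # assumption for this sketch only
        return self.parent.logging_enabled()


package = Executable("Package", "Enabled")
container = Executable("Sequence", "UseParentSetting", parent=package)
noisy_task = Executable("Noisy task", "Disabled", parent=container)
load_task = Executable("Load task", "UseParentSetting", parent=container)

print(load_task.logging_enabled())   # inherits from the package
print(noisy_task.logging_enabled())  # switched off locally
```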

Explain What Is A Data Flow Buffer?

SSIS operates using buffers. A buffer is a kind of in-memory virtual table that holds data as it moves through the data flow.

For Which Containers Is Checkpoint Data Not Saved?

Checkpoint data is not saved for For Loop and Foreach Loop containers.

Mention What Are The Important Components Of Ssis Package?

The important components of an SSIS package are

Data flow
Control flow
Package explorer
Event handler

Explain What Is Solution Explorer In Ssis?

Solution Explorer in SSIS Designer is a screen where you can view and access all the data sources, data source views, projects, and other miscellaneous files.

Explain What Does It Mean By Data Flow In Ssis?

Data flow in SSIS is nothing but the flow of data from the corresponding sources to the target destinations.

Define What Is “task” In Ssis?

A task in SSIS is very similar to a method in a programming language: it represents or carries out an individual unit of work. Tasks fall into two categories

Control Flow Tasks
Database Maintenance Tasks

Explain What Is Ssis Package?

A package in SSIS is an organized collection of connections, control flow elements, data flow elements, event handlers, parameters, variables, and configurations. You assemble them either with the graphical design tools that SSIS provides or programmatically.

Explain What Is A Container? How Many Types Of Containers Are There In Ssis?

In SSIS, a container is a logical grouping of tasks that allows you to manage the scope of those tasks together.

Types of containers in SSIS are

Sequence container
For loop container
Foreach loop container
Task host container

Explain What Is Precedence Constraint In Ssis?

A Precedence Constraint in SSIS enables you to define the logical sequence of tasks in the order they should be executed. You connect the tasks using connectors called Precedence Constraints.

Explain What Variables Are In Ssis And What Are The Types Of Variables In Ssis?

A variable in SSIS is used to store values. There are two types of variables in SSIS: system variables and user variables.

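Explain What Is Conditional Split Transformation In Ssis?

The Conditional Split transformation in SSIS works like an IF or CASE statement. It evaluates each row against a list of conditions and sends the row to the output of the first condition that is true; rows that match no condition go to the default output.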
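
The routing idea can be sketched in Python. This is an illustration only, not SSIS code; the function name, the output names, and the sample rows are all made up for the example.

```python
def conditional_split(rows, conditions, default_output="Default"):
    """Send each row to the first output whose condition is true."""
    outputs = {name: [] for name, _ in conditions}
    outputs[default_output] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                outputs[name].append(row)
                break  # first matching condition wins
        else:
            # No condition matched: the row goes to the default output.
            outputs[default_output].append(row)
    return outputs


orders = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 500},
    {"id": 3, "amount": 5000},
]
routed = conditional_split(
    orders,
    [
        ("Large", lambda r: r["amount"] >= 1000),
        ("Medium", lambda r: r["amount"] >= 100),
    ],
)
print({name: [r["id"] for r in rows] for name, rows in routed.items()})
```

Because the conditions are checked in order, order 3 lands only in the Large output even though it also satisfies the Medium condition.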

List Out The Different Types Of Data Viewers In Ssis?

Different types of data viewers in SSIS include

Grid
Histogram
Scatter Plot
Column Chart

Mention How Would You Deploy An Ssis Package On Production?

To deploy an SSIS package, we execute the deployment manifest file and decide whether to deploy the package to the File System or to SQL Server. Alternatively, you can import the package in SSMS from SQL Server or from the File System.

Explain How To Handle Early Arriving Facts Or Late Arriving Dimension?

Late arriving dimensions are sometimes unavoidable. To handle them, we can create a dummy dimension row with the natural/business key and keep the rest of the attributes as null or default. When the actual dimension row arrives, the dummy row is updated with a Type 1 change. These rows are also referred to as Inferred Dimensions.

Explain How Can You Do An Incremental Load?

A fast and widely used way to do an incremental load is to use a Timestamp column in the source table and to store the last ETL timestamp.

How Would You Do Logging In Ssis?

Logging configuration is a built-in feature that can log the details of various events, such as OnError and OnWarning, to several destinations, for example a flat file, a SQL Server table, an XML file, or SQL Server Profiler.

Mention What Are The Possible Locations To Save Ssis Package?

You can save an SSIS package to

SQL Server
Package Store
File System

What Will Be Your First Approach If A Package Runs Fine In Business Intelligence Development Studio (BIDS) But Fails When Running From A SQL Agent Job?

The account that runs SQL Agent Jobs might not have the required permission for one of the connections in your package. In such cases, either you can create a proxy account or elevate the account permissions.

Explain What Is The Role Of Event Handlers Tab In Ssis?

On the event handlers tab, workflows can be configured to respond to package events.  For instance, you can configure workflow when any task stops, fails or starts.

Explain How You Can Notify The Staff Members About Package Failure?

You could either add a Send Mail Task to the event handlers inside the package or set up a notification in SQL Agent for when the package runs.

How Can An Ssis Package Be Scheduled To Execute At A Defined Time Or At A Defined Interval Per Day?

You can configure a SQL Server Agent Job with a job step type of SQL Server Integration Services Package. The job invokes the dtexec command line utility internally to execute the package. You can run the job (and in turn the SSIS package) on demand, or you can create a schedule for a one-time need or on a recurring basis.

How Would You Do Error Handling?

An SSIS package can mainly have two types of errors

a) Procedure Error: can be handled in the control flow through precedence constraints that redirect the execution flow.

b) Data Error: is handled in the Data Flow Task by redirecting the rows that fail using the Error Output of a component.
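
The data error case can be illustrated with a small Python sketch. It is not SSIS code; the function and column names are hypothetical and only show the idea of sending failing rows to a separate error output instead of failing the whole load.

```python
def convert_rows(rows):
    """Convert 'amount' to a number; redirect rows that fail to an error output."""
    good_output, error_output = [], []
    for row in rows:
        try:
            good_output.append({"id": row["id"], "amount": float(row["amount"])})
        except (ValueError, TypeError) as exc:
            # Keep the original row plus a description of what went wrong.
            error_output.append({**row, "error": str(exc)})
    return good_output, error_output


source_rows = [
    {"id": 1, "amount": "10.5"},
    {"id": 2, "amount": "n/a"},
    {"id": 3, "amount": None},
]
good, errors = convert_rows(source_rows)
print(len(good), "rows loaded,", len(errors), "rows redirected")
```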

How To Pass Property Value At Run Time? How Do You Implement Package Configuration?

A property value, such as the connection string for a Connection Manager, can be passed to the package using package configurations. Package Configuration provides different options such as XML File, Environment Variable, SQL Server Table, Registry Value, or Parent Package Variable.

If You Want To Send Some Data From Access Database To Sql Server Database. What Are Different Component Of Ssis Will You Use?

In the data flow, we will use one OLE DB source, a Data Conversion transformation, and one OLE DB destination or SQL Server destination. The OLE DB source is useful for reading data from Oracle, SQL Server, and Access databases. The Data Conversion transformation is needed to resolve data type mismatches, since the data types differ between the two databases (Access and SQL Server). If the package runs on the same machine as the destination SQL Server instance, we can use the SQL Server destination; otherwise we need to use the OLE DB destination. The SQL Server destination bulk-loads data into SQL Server, which makes it the optimized choice for local loads.

What Is Sql Server Integration Services (ssis)?

SQL Server Integration Services (SSIS) is a component of SQL Server 2005 and later versions. SSIS is an enterprise-scale ETL (Extraction, Transformation and Load) tool which allows you to develop data integration and workflow solutions. Apart from data integration, SSIS can be used to define workflows that automate updating multi-dimensional cubes and maintenance tasks for SQL Server databases.

How Does Ssis Differ From Dts?

SSIS is a successor to DTS (Data Transformation Services) and has been completely re-written from scratch to overcome the limitations of DTS which was available in SQL Server 2000 and earlier versions. A significant improvement is the segregation of the control/work flow from the data flow and the ability to use a buffer/memory oriented architecture for data flows and transformations which improve performance.

What Is The Control Flow?

When you start working with SSIS, you first create a package which is nothing but a collection of tasks or package components. The control flow allows you to order the workflow, so you can ensure tasks/components get executed in the appropriate order.

What Is The Data Flow Engine?

 The Data Flow Engine, also called the SSIS pipeline engine, is responsible for managing the flow of data from the source to the destination and performing transformations (lookups, data cleansing etc.).  Data flow uses memory oriented architecture, called buffers, during the data flow and transformations which allows it to execute extremely fast. This means the SSIS pipeline engine pulls data from the source, stores it in buffers (in-memory), does the requested transformations in the buffers and writes to the destination. The benefit is that it provides the fastest transformation as it happens in memory and we don't need to stage the data for transformations in most cases.

What Is Execution Tree?

Execution trees demonstrate how a package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created, such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation, and each new tree may also give you an additional worker thread.

Difference Between Union All And Merge?

a) The Merge transformation can accept only two inputs, whereas Union All can take more than two inputs.

b) Data has to be sorted before the Merge transformation, whereas Union All has no such requirement.
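
The difference is easy to see with plain Python lists. This is an analogy rather than SSIS code: itertools.chain plays the role of Union All, and heapq.merge, applied here to exactly two inputs, plays the role of Merge.

```python
from heapq import merge
from itertools import chain

input_a = [1, 4, 9]   # sorted
input_b = [2, 3, 10]  # sorted
input_c = [7, 0]      # unsorted, which is fine for Union All

# Union All: any number of inputs, no sort requirement, no ordering guarantee.
union_all = list(chain(input_a, input_b, input_c))

# Merge: two sorted inputs are interleaved into one sorted output.
merged = list(merge(input_a, input_b))

print(union_all)
print(merged)
```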

How Would You Restart Package From Previous Failure Point? What Are Checkpoints And How Can We Implement In Ssis?

When a package is configured to use checkpoints, information about package execution is written to a checkpoint file. When the failed package is rerun, the checkpoint file is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time that the package is run. To implement checkpoints, set the package properties CheckpointFileName, CheckpointUsage, and SaveCheckpoints, and set FailPackageOnFailure to True on the tasks that should act as restart points.
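
The restart behaviour can be mimicked in Python to make the life cycle of the checkpoint file concrete. This is a sketch under stated assumptions, not how SSIS stores its state: the run_package function, the JSON file format, and the task names are all invented for the illustration.

```python
import json
import os
import tempfile


def run_package(tasks, checkpoint_path):
    """Run tasks in order, skipping those recorded in the checkpoint file."""
    completed = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            completed = json.load(f)
    for name, action in tasks:
        if name in completed:
            continue  # already done in the failed run
        try:
            action()
        except Exception:
            # Record progress so the next run restarts from this task.
            with open(checkpoint_path, "w") as f:
                json.dump(completed, f)
            return False
        completed.append(name)
    # Successful run: the checkpoint file is deleted.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return True


calls = []
state = {"fail": True}


def extract():
    calls.append("extract")


def load():
    if state["fail"]:
        raise RuntimeError("simulated failure")
    calls.append("load")


path = os.path.join(tempfile.mkdtemp(), "package.chk")
tasks = [("extract", extract), ("load", load)]
first = run_package(tasks, path)    # fails at load; checkpoint file is written
state["fail"] = False
second = run_package(tasks, path)   # restarts at load; extract is skipped
print(first, second, calls)
```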

Where Are Ssis Package Stored In The Sql Server?

In SQL Server 2005, MSDB.sysdtspackages90 stores the actual content, and sysdtscategories, sysdtslog90, sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog, and sysdtstasklog play supporting roles.

Difference Between Asynchronous And Synchronous Transformations?

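A synchronous transformation processes rows one at a time and reuses the input buffer for its output. An asynchronous transformation has different input and output buffers, and it is up to the component designer of an asynchronous component to provide a column structure for the output buffer and hook up the data from the input.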
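
As a rough analogy in Python (not SSIS code, with made-up function and column names): a synchronous transformation such as a derived column changes rows in place and hands back the same buffer, while an asynchronous one such as a sort has to collect rows from its input buffers and build a new output buffer.

```python
def add_line_total(buffer):
    """Synchronous style: rows are modified in place, same buffer goes out."""
    for row in buffer:
        row["total"] = row["qty"] * row["price"]
    return buffer


def sort_by_total(buffers):
    """Asynchronous style: all input rows are read, a new output buffer is built."""
    all_rows = [row for buffer in buffers for row in buffer]
    return sorted(all_rows, key=lambda row: row["total"])


buffer_1 = [{"qty": 2, "price": 5.0}, {"qty": 1, "price": 3.0}]
buffer_2 = [{"qty": 4, "price": 1.0}]

same_1 = add_line_total(buffer_1)
same_2 = add_line_total(buffer_2)
output = sort_by_total([buffer_1, buffer_2])

print(same_1 is buffer_1)              # the input buffer was reused
print([row["total"] for row in output])
```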

How To Achieve Parallelism In Ssis?

Parallelism is achieved using the MaxConcurrentExecutables property of the package. Its default value is -1, which means the limit is calculated as the number of processors + 2.
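
The arithmetic behind the default can be written out as a tiny Python helper. The function name is hypothetical and the helper only restates the rule above; it does not read anything from SSIS.

```python
import os


def effective_max_concurrent(setting=-1, processors=None):
    """Return the number of executables that may run at the same time."""
    if processors is None:
        processors = os.cpu_count() or 1
    if setting == -1:
        # Default: number of processors + 2.
        return processors + 2
    return setting


print(effective_max_concurrent(-1, processors=4))  # default on a 4-processor machine
print(effective_max_concurrent(3, processors=4))   # explicit setting is used as-is
```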

How Do You Do Incremental Load?

A fast way to do an incremental load is to use a Timestamp column in the source table and store the last ETL timestamp. In the ETL process, pick all the rows whose Timestamp is greater than the stored Timestamp, so that only new and updated records are picked up.
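
The pattern can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch, not an SSIS package: the table names (source, target, etl_watermark), the column names, and the use of SQLite instead of SQL Server are all assumptions made to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT);
    CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT);
    CREATE TABLE etl_watermark (last_loaded_at TEXT);
    INSERT INTO etl_watermark VALUES ('1900-01-01 00:00:00');
""")


def incremental_load(conn):
    """Copy only rows changed since the stored timestamp, then advance it."""
    (watermark,) = conn.execute(
        "SELECT last_loaded_at FROM etl_watermark").fetchone()
    rows = conn.execute(
        "SELECT id, name, modified_at FROM source "
        "WHERE modified_at > ? ORDER BY modified_at",
        (watermark,),
    ).fetchall()
    for row in rows:
        # Insert new rows and overwrite rows that already exist in the target.
        conn.execute(
            "INSERT OR REPLACE INTO target (id, name, modified_at) "
            "VALUES (?, ?, ?)", row)
    if rows:
        # Store the newest timestamp seen as the new last ETL timestamp.
        conn.execute("UPDATE etl_watermark SET last_loaded_at = ?",
                     (rows[-1][2],))
    conn.commit()
    return len(rows)


conn.execute("INSERT INTO source VALUES (1, 'alpha', '2018-10-01 08:00:00')")
conn.execute("INSERT INTO source VALUES (2, 'beta', '2018-10-02 09:00:00')")
first = incremental_load(conn)    # both rows are new

conn.execute("UPDATE source SET name = 'beta v2', "
             "modified_at = '2018-10-03 10:00:00' WHERE id = 2")
second = incremental_load(conn)   # only the updated row is picked up
print(first, second)
```

The timestamps are stored as ISO-formatted text so that string comparison matches chronological order; a real source table would normally use a proper date/time column.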

How To Handle Late Arriving Dimension Or Early Arriving Facts?

Late arriving dimensions are sometimes unavoidable because of a delay or error in the Dimension ETL, or because of the logic of the ETL itself. To handle early arriving facts, we can create a dummy Dimension row with the natural/business key and keep the rest of the attributes as null or default. As soon as the actual dimension row arrives, the dummy row is updated with a Type 1 change. These are also known as Inferred Dimensions.
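
The dummy-row approach can be sketched with an in-memory dictionary standing in for the dimension table. This is only an illustration; the function names, the column names, and the sample business key are hypothetical.

```python
dimension = {}  # business key -> dimension row


def surrogate_key_for(business_key):
    """Return the surrogate key, creating a dummy row if the key is unknown."""
    row = dimension.get(business_key)
    if row is None:
        # Early arriving fact: insert a placeholder with default attributes.
        row = {"surrogate_key": len(dimension) + 1, "name": None, "inferred": True}
        dimension[business_key] = row
    return row["surrogate_key"]


def load_dimension_row(business_key, name):
    """Insert a new dimension row, or overwrite a dummy row (Type 1 change)."""
    row = dimension.get(business_key)
    if row is None:
        dimension[business_key] = {
            "surrogate_key": len(dimension) + 1, "name": name, "inferred": False}
    else:
        row["name"] = name        # overwrite in place, no history kept
        row["inferred"] = False


fact_key = surrogate_key_for("CUST-42")    # the fact arrives before the dimension
load_dimension_row("CUST-42", "Contoso")   # the real dimension row arrives later
print(fact_key, dimension["CUST-42"])
```

The fact row keeps the surrogate key it was given at load time, so nothing in the fact table has to change when the real attributes arrive.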

What Is A Transformation?

A transformation simply means bringing the data into a desired format. For example, you are pulling data from the source and want to ensure only distinct records are written to the destination, so duplicates are removed. Another example is when you have master/reference data and want to pull only related data from the source, and hence you need some sort of lookup. There are around 30 transformation tasks available, and this can be extended further with custom built tasks if needed.

How Can You Configure Your Ssis Package To Run In 32-bit Mode On 64-bit Machine When Using Some Data Providers Which Are Not Available On The 64-bit Platform?

To run an SSIS package in 32-bit mode, set the SSIS project property Run64BitRuntime to "False". The default value of this property is "True". Setting it to "False" is an instruction to load the 32-bit runtime environment rather than the 64-bit one, and your packages will still run without any additional changes. The property can be found under SSIS Project Property Pages -> Configuration Properties -> Debugging.

How Is Ssis Runtime Engine Different From The Ssis Dataflow Pipeline Engine?

The SSIS Runtime Engine manages the workflow of the packages during runtime, which means its role is to execute the tasks in a defined sequence.  As you know, you can define the sequence using precedence constraints. This engine is also responsible for providing support for event logging, breakpoints in the BIDS designer, package configuration, transactions and connections. The SSIS Runtime engine has been designed to support concurrent/parallel execution of tasks in the package.

The Dataflow Pipeline Engine is responsible for executing the data flow tasks of the package. It creates a dataflow pipeline by allocating in-memory structure for storing data in-transit. This means, the engine pulls data from source, stores it in memory, executes the required transformation in the data stored in memory and finally loads the data to the destination. Like the SSIS runtime engine, the Dataflow pipeline has been designed to do its work in parallel by creating multiple threads and enabling them to run multiple execution trees/units in parallel.

What Is A Task?

A task is very much like a method of any programming language which represents or carries out an individual unit of work. There are broadly two categories of tasks in SSIS, Control Flow tasks and Database Maintenance tasks. All Control Flow tasks are operational in nature except Data Flow tasks. Although there are around 30 control flow tasks which you can use in your package you can also develop your own custom tasks with your choice of .NET programming language.

What Is A Precedence Constraint And What Types Of Precedence Constraint Are There?

SSIS allows you to place as many tasks as you want in the control flow. You can connect all these tasks using connectors called Precedence Constraints. Precedence Constraints allow you to define the logical sequence of tasks in the order they should be executed. You can also specify a condition to be evaluated before the next task in the flow is executed.

The condition can be a constraint, an expression, or both. These are the types of precedence constraints

Success - the next task is executed only when the previous task completed successfully

Failure - the next task is executed only when the previous task failed

Completion - the next task is executed whether the previous task succeeded or failed
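
The decision a precedence constraint makes can be written as a small Python function. This is a sketch of the logic only; the function name and the evaluation labels ("Constraint", "Expression", "ExpressionAndConstraint") are names chosen for this example.

```python
def should_run(constraint, previous_succeeded, expression_result=True,
               evaluation="Constraint"):
    """Decide whether the next task runs, given the previous task's outcome."""
    constraint_ok = {
        "Success": previous_succeeded,
        "Failure": not previous_succeeded,
        "Completion": True,  # runs whether the previous task succeeded or failed
    }[constraint]
    if evaluation == "Constraint":
        return constraint_ok
    if evaluation == "Expression":
        return expression_result
    if evaluation == "ExpressionAndConstraint":
        return constraint_ok and expression_result
    raise ValueError(f"unknown evaluation: {evaluation}")


print(should_run("Success", previous_succeeded=True))
print(should_run("Failure", previous_succeeded=True))
print(should_run("Completion", previous_succeeded=False))
print(should_run("Success", True, expression_result=False,
                 evaluation="ExpressionAndConstraint"))
```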

What Is A Container And How Many Types Of Containers Are There?

A container is a logical grouping of tasks which allows you to manage the scope of the tasks together.

These are the types of containers in SSIS

Sequence Container - Used for grouping logically related tasks together

For Loop Container - Used when you want to have repeating flow in package

Foreach Loop Container - Used for enumerating each object in a collection; for example a record set or a list of files.

Apart from the above mentioned containers, there is one more container called the Task Host Container which is not visible from the IDE, but every task is contained in it (the default container for all the tasks).

What Are Variables And What Is Variable Scope?

A variable is used to store values. There are basically two types of variables, System Variable (like ErrorCode, ErrorDescription, PackageName etc) whose values you can use but cannot change and User Variable which you create, assign values and read as needed. A variable can hold a value of the data type you have chosen when you defined the variable.

Variables can have different scopes depending on where they are defined. For example, you can have package level variables, which are accessible to all the tasks in the package, and container level variables, which are accessible only to the tasks within that container.
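
Variable scope behaves much like nested scopes in a programming language, which can be sketched in Python. The Scope class and the variable names below are hypothetical and serve only to illustrate the lookup rule.

```python
class Scope:
    """Hypothetical scope: a package, a container, or a task."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.variables = {}

    def get(self, variable):
        # Look in this scope first, then walk up through the parents.
        scope = self
        while scope is not None:
            if variable in scope.variables:
                return scope.variables[variable]
            scope = scope.parent
        raise KeyError(variable)


package = Scope("Package")
package.variables["BatchId"] = 7             # package level variable

loop = Scope("Foreach Loop", parent=package)
loop.variables["FileName"] = "a.csv"         # container level variable

task_inside = Scope("Load file", parent=loop)
task_outside = Scope("Send mail", parent=package)

print(task_inside.get("FileName"), task_inside.get("BatchId"))
print(task_outside.get("BatchId"))
# task_outside.get("FileName") raises KeyError: the variable is out of scope.
```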

What Is An Ssis Proxy Account And Why Would You Create It?

A proxy account lets a SQL Server Agent job step run under a credential other than that of the SQL Server Agent service account. When we try to execute an SSIS package from a SQL Server Agent Job without one, it can fail with the message "Non-SysAdmins have been denied permission to run DTS Execution job steps without a proxy account". This error message is generated if the account under which the SQL Server Agent Service is running and the job owner are not sysadmins on the instance, or if the job step is not set to run under a proxy account associated with the SSIS subsystem.
