Data integration and data transformation solutions can be built using SQL Server Integration Services. By copying or downloading files, loading data warehouses, cleaning, and mining data, and managing SQL Server objects and data, Integration Services can solve complex business problems. We have provided 25 interview questions and answers for the same. Let’s continue.
1. What Is SSIS?
SSIS was initially introduced with SQL Server 2005, which was the next generation of SQL Server software after SQL Server 2000. No doubt, SSIS is an ETL tool that handles data Extraction, Transformation, and load. Moreover, it can handle more tasks besides the ETL tasks such as data profiling, file system manipulation, etc. So, if you know C# .net or VB.net, you can use the script task to perform much more efficient batch operations.
2. What Is SSIS Control Flow Integration?
Control flows of SSIS let you program graphically how the tasks will run by using the logical connectors between tasks. In total, there are three basic logical connectors that you can use: success, failure, or complete. Also, you can use the FX (expression) to handle more complex conditions in your control flow.
3. What Is Data Transformation?
This data transformation stage applies a series of rules or functions to the extracted data from the source to derive the data for loading into the end target.
4. What Kind Of Variables Can You Create?
No doubt, you surely can create global variables and task-level variables in SSIS. For developers and programmers, these variables are the same as global and function level variables. Global variables are all available for all tasks across the entire job. All the variables, which are created in tasks are only available within that task.
5. What Types Of Containers Can You Use With SSIS Packages?
Totally, there are three types of containers i.e. loops, sequences, and for each loop.
The first of them includes a sequence container, which is a simple way to group similar tasks together. You can think of this sequence container as an organization container for more complex SSIS packages.
Moreover, a “for loop” container is what you can use to execute your tasks a certain number of times. For example, if you need to update records ten times, you can place the task that updates the records inside this for loop container and specifies 10 as the end of the loops. By making use of the for loop container, you don’t have to create either ten different packages to do the same task or have to run the whole package ten times when you schedule your job.
This for each loop container will prove useful when you don’t know ahead of time how many times a task should perform. For instance, let’s say that you want to delete all the files inside a folder, but you don’t know how many files are there at any particular time. By making use of each loop, it can go through the collection of files and delete them for you after the collection is emptied out. Also, it knows when it should stop.
6. How To Create The Deployment Utility?
The deployment is an entire process in which packages convert from development mode into executables mode. To deploy the SSIS package, you can directly deploy the package by right-clicking the Integration Services project and building it.
This all will be saved in the package.dtsx file on the projectbin folder. Moreover, you can also create the deployment utility using which the package can be deployed at either SQL Server or as a file on any location.
For creating a deployment utility, follow these steps:
- First, you should right-click on the project and click on properties.
- Select “True” for creating the DeploymentUtiltiy Option. You can also set deployment paths for the same.
- Now, you can close the window after making the changes and build the project by right-clicking on the project.
- You will see a deployment folder, which will be created in the BIN folder of your main project location.
- Inside this folder (deployment), you will find the .manifest file, double-clicking on it you can get options to deploy the package on SQL Server.
- You can now log in to SQL Server and check in MSDB on Integration Services.
7. What Is The Manifest File In SSIS?
Manifest files are the utilities, which can be used to deploy the package using the wizard on the file system and SQL Server database.
8. What Is File System Deployment?
In short, file system deployment means saving package files on a local or network drive. Later on, you can use the SQL Agent job to schedule when the packages will run.
9. How Can You Back Up Or Retrieve The SSIS Packages?
You can back up the MSDB database as all the packages on the SQL server deploy at MSDB especially if your package is deployed on SQL Server,
10. What Is The Data Flow Task In SSIS?
Basically, the data flow task is the pumping mechanism, which moves data from source to destination. But, in the case of SSIS, you have much more control over what happens from start to finish. No doubt, you have a set of out-of-the-box transformation components that you snap together to clean and manipulate the data while it is in the data pipeline.
In the same way control flow handles the main workflow of the package, the same is followed by the Data Flow, which handles the transformation of data. Anything manipulating the data falls into Data Flow Category.
11. What Is The Data Profiling Task?
Shortly, data profiling is a process of analyzing the source data to better understand what condition the data is in. This is in terms of cleanliness, patterns, numbers or nulls, and so on. Moreover, the data profiling tasks usually be used at the beginning of the development cycle to support the design of the destination database schema. Please be noted that this task is not used when you develop the normal recurring ETL packages.
12. What Is The Multicast Transformation In SSIS?
Multicast transform, as we can see from the name, can send single data input to multiple output paths easily. You could be looking to use this transformation to send a path to multiple destinations sliced in different ways. Multicast transformation, no doubt, is as same as the Split Transformation because both send data to multiple outputs. Yes but in that case, you are unable to mentioned and specify the conditions for which part of the data will be in which output in the Multicast transformation.
13. Explain The Difference Between Merge And Union All?
Merge transformation, in short, can perfectly merge data from two paths into a single output. Transform proves useful especially when you want to break out your Data Flow into a path that handles certain errors and then merges it back into the main Data Flow downstream after the errors have been handled. It is also useful if you wish to merge data from two Data Sources.
Please be noted that the data must be sorted before using the Merge Transformation. Thiis you can do simply by using the sort transformation prior to the merge or by specifying an ORDER By clause in the source connection. Moreover, the metadata must be the same for both paths. Like for example, the Customer column cannot be a numeric column in one path and a character column in the other path.
You can union all of the transformations works much in the same way as the Merge Transformation, but it does not require the data to be sorted. Normally, it takes the outputs from multiple sources or transforms and combines them into a single result set.
14. OLE DB Command Transform?
TOLE DB Command Transforms are the components designed to execute a SQL Statement for each row in an input stream. These tasks are analogous to an ADO Command Object being created, prepared, and executed for each row of a result set. Input streams provide mention and specify that data for parameters, which can be set into the SQL Statement that is either an Inline statement or a stored procedure call.
15. Execute Package Task?
Execute Package Task in simple enables you to build SSIS solutions called parent packages that execute other packages called as Child Packages. You will be finding this capability an indispensable part of your SSIS development as your packages begin to grow.
Moreover, separating these packages into discrete functional workflows for shorter development and testing cycles also facilitates best development practices. Normally, in SSIS, child packages are aware of the parent package parameters and can reach out and access those parameters–level configuration values. The majority of the pre-configurable properties are in the Package tab of the Executable package Task Editor.
A very first option is to provide the location of the child package. Options here include either the File system and SQL Server. You can easily deploy the SSIS packages in the FIle system task as a .dtsx file or within the msdb databases of SQL Server instances. In case, if you select a file system, you must first create a new connection manager connection to the child package.
If child packages are located in a SQL Server, you will need to provide the OLE DB Connection Manager for the SQL Server that holds your packages. In both cases, you must browse to and then select any child package within the connection to set the package to execute in the task.
16. What Is A Transaction In The SSIS Package And How To Implement It?
SSIS packages use transactions to bind the database actions that tasks perform into atomic units. By doing this, data integrity is maintained. The Al MS is container types include packages for the For loops, For Each loops, and Sequence containers, and the tasks hosts, which encapsulate each task and they all can be configured to use transactions. Moreover, IS also provides three to four options for configuring transactions: Not supported, Supported, and Required.
- First option i.e. Require indicates that container initiates a successful transaction unless one is already started by its parent container. If any of the transactions already exist, the container joins the transaction. Like for example, if a package, which is not configured to support transactions includes a Sequence container (that uses the Required option), the Sequence Container would initiate its own transactions. If packages were previously configured to use the Required options, the Sequence container would join the package transaction.
- Second option i.e. Supported clearly indicates that container does not start a transaction, but joins any transaction started by its parent container. Like for example, if a package with four Executable SQL tasks starts a transaction and all four tasks use the Supported options, the database perfectly updates performed by the Execute SQL tasks are rolled back if any task fails. If any of the packages does not start a transaction, the four execute SQL tasks are not at all bound by a transaction. Also, no database updates should exist except the ones performed by the failed task are rolled back.
- Not Supported i.e. last option clearly indicates that container does not initiate any transaction or join to an existing one. Transactions started by a parent container do not affect child containers that have been configured to Not Support transactions. For instance, if the package is configured to start a transaction and a For Loop Container in the package uses the NotSupported option, none of these tasks in the For Loop can roll back if they fail.
17. Explain The Difference Between executing T-SQL Task And Executing SQL Task?
In system of SSIS, there is a single task i.e. Execute T-SQL. This task is as same as Execute SQL task. Let’s see the difference between these two.
Executing SQL Tasks:
- Pros: They take less memory, faster performance
- Cons: They output into variables not supported. They only support ADO.net connection
Executing SQL Tasks:
- Pros: Supporting output into variables and multiple types of connection, parameterized query possible.
- Cons: Taking more memory, slower performance compared to the TSQL task.
18. Precedence Constraints?
Tasks are only executed if the condition that is set by the precedence constraint preceding the task is met. By making use of these constraints, it will choose different execution paths depending on the success or failure of other tasks.
- Success Path: Workflow will proceed in this path when the preceding container executes successfully. It is clearly mentioned in the control flow (by a solid green line).
- Failure Path: This path lets workflow to proceed when the preceding container’s execution results in a failure. It is clearly mentioned in the control flow (by a solid red line).
- Completion Path: Workflow will move ahead with this path when the preceding container’s execution completes, regardless of success or failure. It is clearly mentioned in the control flow (by a solid blue line).
- Path for Expression/Constraint with logical AND: Workflow will move ahead when specified expression and constraints evaluate to true. It is clearly mentioned in control flow (by a solid color line) along with a small ‘fx’ icon next to it. Line’s color depends on logical constraint chosen (e.g. success = green, completion = blue).
19. What Is The Benefit Of A Config File In SSIS?
Configuring file in SSIS is used to provide inputs to connection manager different properties that package use to assign values at runtime dynamically.
By making the use of this configuration file, users need to make changes to the configuration file which package will take automatically at runtime because of using it you don’t need each time to introduce required changes in the packages every in case you are deploying package on multiple servers or locations. Also, there are lots of ways in which configuration values can be stored. XML config files: They store the configuration file as an XML file.
- Environment variables – They store the config in one of the environment variables.
- Registry entries – They store the config in the registry
- Parent package variables – They store the config as a variable in the package that contains the tasks.
- SQL Server- They store the config in a table in SQL Server
20. Explain The Difference Between Control Flow And Data Flow?
- Control flow is used to design the flow of the package. Data flows are best for ETL processes.
- Data Flows are the subsets of control flow
- You will find only one control flow while multiple dataflow can exists.
- Data flows cannot work without a control flow
All of the process-base tasks are part of control flow while ETL related tasks are part of Dataflow which is again a subset of control flow.
21. Checkpoint?
Checkpoints are a property in SSIS which enables the project to restart from the point of failure. When you are setting the property to true package create the checkpoint file, and it stores the information about package execution and uses it to restart the package from the point of failure. If packages run successfully, checkpoint files are deleted, and then re-created the next time the package runs.
22. Can I Run Ssis Packages With SQL Server Express Or Web Or Workgroup Editions?
You might need dtexec to run SSIS packages from the command line.
In old versions of SQL2005 Express:
- You got dtexec and necessary tools with SQL SERVER 2005 EXPRESS EDITION HAVING ADVANCED SERVICES
- MICROSOFT SQL SERVER 2005 EXPRESS EDITION TOOLKIT
But, having no SQL Agent can let you use other scheduling methods.
23. How Can I Fail A Package (Manually) In Integration Services?
You can run an Execute SQL Task statement in my SSIS package. The Execute SQL Task runs SQL and checks if the tables have more than 1000 rows. If there are less than 1000 rows, you will fail the package.
24. How Do I Make A Forceful Failure Inside Of A SQL Statement?
Tasks are failed through AFAIK. Therefore, if your Execute SQL Task has a statement like so in it:
declare @count int
select @count = select count(*) from my_table
if @count < 1000
begin
raiserror(‘Too few rows in my_table’,16,1)
end
else
begin
— Process your table here
end
You should obtain expected results.
25. There Is No Process On The Other End Of The Pipe?
You can switch to Windows authentication, disconnect, and then try to log in again with SQL authentication. This will let you avoid this error. You will be told about your password expiry in coming screen. You must change your password and it should start working now.
Conclusion
The above-mentioned are the 25 best questions and answers for the SSIS interview. They are the most basic ones, which almost everyone asks but depending on the nature of work, organization, and the given role, they may vary. Good Luck!